Abstract: Computation based on genomic data is becoming increasingly
popular, be it for medical or other purposes. Non-medical uses of genomic
data in a computation often take place in a server-mediated setting where
the server offers the ability for joint genomic testing between the users.
Undeniably, genomic data is highly sensitive and, in contrast to other
biometry types, discloses a plethora of information not only about the data
owner, but also about his or her relatives. Thus, there is an urgent need
to protect genomic data. This is particularly true when the data is used in
computation for what we call recreational non-health-related purposes.
Towards this goal, in this work we put forward a framework for server-aided
secure two-party computation with the security model motivated by genomic
applications. One particular security setting that we treat in this work
provides stronger security guarantees with respect to malicious users than
the traditional malicious model. In particular, we incorporate certified
inputs into secure computation based on garbled circuit evaluation to
guarantee that a malicious user is unable to modify her inputs in order to
learn unauthorized information about the other user’s data. Our solutions
are general in the sense that they can be used to securely evaluate
arbitrary functions and offer attractive performance compared to the state
of the art. We apply the general constructions to three specific types of
genomic tests (paternity, genetic compatibility, and ancestry testing) and
implement the constructions. The results show that all such private tests
can be executed within a matter of seconds or less despite the large size
of one’s genomic data.

1 Introduction

The motivation for this work comes from the rapidly expanding availability
and use of genomic data in a variety of applications and the need to
protect such highly sensitive data from potential abuse. The cost of
sequencing one’s genome has dramatically decreased in recent years and is
continuing to decrease, which makes such data more readily available for a
number of applications such as:

- personalized medicine uses genomic tests prior to prescribing a
treatment to ensure its success;

-  paternity  testing  uses  DNA  data  to  determine  whether  one 
individual  is  the  father  of  another  individual; 

-  genomic  compatibility  tests  allow  potential  or  current 
partners  to  determine  whether  their  future  children  are 
likely  to  inherit  genetic  conditions; 

-  determining  ancestry  and  building  genealogical  trees  is 
done  by  examining  DNA  data  of  many  individuals  and 
finding  relationships  among  specific  individuals. 

Genomic tests are increasingly used for medical purposes to ensure the best
treatment. A number of services for what we call the “leisure” use of DNA
data have flourished as well (e.g., [1-3]), allowing for various forms of
comparing DNA data, be it for building ancestry trees, testing genomic
compatibility, or other purposes.

It is clear that DNA is highly sensitive and needs to be protected from
abuse. It can be viewed as being even more sensitive than other types of an
individual’s biometry: not only does it allow for unique identification of
the individual, but it also reveals a plethora of information about the
individual, such as predisposition to medical conditions and the
individual’s relatives, thus exposing information about others as well.
Furthermore, our understanding of genomes is continuously growing, and
exposure of DNA data now can lead to consequences which we cannot even
anticipate today. For that reason, a number of publications that enable
genomic computation while preserving privacy of DNA data have appeared in
the literature (see, e.g., [9-12, 15, 29]). Such publications span several
types of genomic computation, including medical applications (such as
personalized medicine and disease risk computation) and non-medical
applications (such as paternity testing).

While protecting privacy of genomic data is important for all applications,
in our opinion, it is more difficult for an individual to influence the way
medical procedures are conducted than to influence services in which
individuals voluntarily decide to participate. That is, if genetic tests
are necessary for a patient to determine the most effective treatment and
the procedures in place are considered law-compliant, the patient has
little possibility of influencing the way the DNA tests are conducted
(besides, perhaps, declining to take the test and risking that the
prescribed generic treatment is ineffective for her or has severe side
effects). With what we consider as “leisure” uses of DNA information, the
situation is different. An individual who meets a potential partner through
a gene-based matchmaking online service (such as [3]) might be reluctant to
share her DNA data with the service (or the partner) for the purpose of
genetic compatibility tests. However, if the user is assured that no
sensitive information about her DNA will be revealed to any party
throughout the computation other than the intended outcome, she might
revisit the decision to participate in such services. Thus, in the rest of
this work, when we refer to genomic computation, we focus on applications
which are not detrimental to the well-being of an individual and rather
consider tests in which individuals might choose to participate.

The first observation we make about such types of genomic computation is
that they are normally facilitated by some service or a third party. For
example, both ancestry and gene-based matchmaking web sites allow
participants to interact with each other through the service provider. Such
service providers serve as a natural point for aiding the individuals with
private computation on their sensitive genomic data. Some prior
publications on genomic computation (e.g., [12]) assume that computation
such as paternity testing or genetic compatibility is run between a client
and a server, while we believe that it is more natural to assume that such
computation is carried out by two individuals through some third-party
service provider. Thus, in this work we look at private genomic computation
in the server-mediated setting and utilize the server to lower the cost of
the computation for the participants. Throughout this work, we will refer
to the participants as Alice (A), Bob (B), and the server (S).

From the security point of view, participants in a protocol that securely
evaluates a function are normally assumed to be either semi-honest (also
known as honest-but-curious or passive) or malicious (also known as
active). In our application domain, we may want to distinguish between
different security settings depending on how well Alice and Bob know each
other. For example, if Alice and Bob are relatives and would like to know
how closely they are related (i.e., how closely their genealogical trees
overlap), it would be reasonable to assume that they will not deviate from
the prescribed computation in an attempt to cheat each other, i.e., they
can be assumed to be semi-honest. On the other hand, if Alice and Bob meet
each other through a matchmaking web site and do not know each other well,
it is reasonable for them to be cautious and engage in a protocol that
ensures security (i.e., correctness and privacy) even in the presence of
malicious participants. The server can typically be expected not to deviate
from its prescribed behavior, as it would lose its reputation and
consequently revenue if any attempts at cheating became known. If, however,
adding protection against the server’s malicious actions is not very
costly, it can also be meaningful to assume a stronger security model.


Another important consideration from a security point of view is enforcing
that correct inputs are entered in the computation when, for instance, the
inputs are certified by some authority. This requirement is outside the
traditional security model for secure multi-party computation (even in the
presence of fully malicious actors), and to the best of our knowledge
certified inputs were previously considered only for specific
functionalities such as private set intersection [22, 31] or anonymous
credentials and certification [19], but not for general secure function
evaluation (SFE). We bring this up in the context of genomic computation
because for certain types of genomic tests it is very easy for one
participant to modify his inputs and learn sensitive information about
genetic conditions of the other party. For example, genetic compatibility
tests evaluate the possibility that two potential or existing partners
transmit a genetic disease to their children. Such a possibility is present
when both partners are (silent) carriers of that disease (see section 3.1
for more detail). Then if the partners can each separately evaluate their
DNA for a fingerprint of a specific disease, the joint computation can
consist of a simple AND of the bits provided by both parties (for one or
more conditions). Now if a malicious participant sets all of his input bits
to 1 and the outcome is positive, the participant learns that the other
party is a carrier for a specific medical condition (or at least one
condition from the set of specific conditions). We thus want to prevent
malicious participants from modifying their inputs used in genomic
computation in cases where such data can be certified by certification
authorities such as medical facilities.
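
The input-substitution attack described above can be made concrete with a
small plaintext sketch. It assumes one carrier bit per condition and a
positive outcome whenever both bits are 1 for at least one condition; the
function name and the example bit vectors are hypothetical, and no
cryptography is involved.

```python
from typing import Sequence


def compatibility_outcome(bits_a: Sequence[int], bits_b: Sequence[int]) -> bool:
    """Positive if, for at least one condition, both carrier bits are 1."""
    if len(bits_a) != len(bits_b):
        raise ValueError("both parties must supply one bit per condition")
    return any(a & b for a, b in zip(bits_a, bits_b))


# Hypothetical carrier bits for three conditions.
bob_true = [0, 1, 0]    # Bob silently carries the second condition
alice_true = [0, 0, 0]  # Alice carries none of them

# Honest run: the outcome is negative and reveals nothing specific.
honest_result = compatibility_outcome(alice_true, bob_true)

# Malicious run: Alice replaces her input with all ones, so a positive
# outcome tells her that Bob carries at least one condition.
probe_all = compatibility_outcome([1, 1, 1], bob_true)

# Probing one condition at a time pinpoints which one Bob carries.
probe_each = [
    compatibility_outcome([int(i == j) for j in range(3)], bob_true)
    for i in range(3)
]
```

A protocol that is secure in the standard malicious model computes this
function correctly on whatever bits are supplied, which is why the probing
inputs have to be ruled out by certification rather than by the protocol's
correctness guarantee alone.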

In this work we also address fairness, one of the important properties of
secure computation. In particular, it is known that full fairness cannot be
achieved in the case of two-party computation in the malicious security
model [26], but it becomes possible in the server-aided setting. Fairness
has been considered in the server-aided literature in the past [38, 47],
and achieving fairness adds only minimal overhead to the solutions in the
settings we consider.

Contributions. While we draw motivation from genomic computation, our
results are general and can be applied to any function. All constructions
rely on garbled circuit evaluation, which is typically used in the
two-party setting (see section 3.2) but which we adapt to three-party
computation between the server and two users. Based on the motivation given
above, we consider different adversarial settings, which we present from
the simplest, which enables the most efficient solutions, to the most
complex, which adds security.

1.  Our  most  efficient  solution  is  designed  for  the  setting 
where  A  and  B  are  semi-honest  (and  S  can  be  malicious), 
as  in  ancestry  testing.  In  this  setting,  the  solution  consists 
of  a  single  circuit  garbling  and  evaluation  and  the  need  for 
oblivious  transfer  (OT)  is  fully  eliminated. 

2. Our second solution assumes that A and B can be malicious, but S is
semi-honest, as suitable for the paternity test, and achieves fairness for
A and B. In this solution, the combined work for all participants is
approximately the same as the combined work of two participants in a
two-party protocol with semi-honest parties only.

3. Our last solution strengthens the model of malicious A and B with input
certification (applicable to the genomic compatibility test). In more
detail, in addition to being able to behave arbitrarily, A and B may
maliciously modify their true inputs. To combat this, the function being
evaluated is modified to mark any suitable subset of the inputs as
requiring certification. At the time of secure function evaluation, A and B
have to prove that the inputs they enter in the protocol are identical to
the values signed by a trusted authority (a medical facility that performs
genomic tests in our case). Achieving this involves the use of additional
tools such as a signature scheme and zero-knowledge proofs of knowledge
(ZKPKs). Handling of the remaining inputs and the rest of the computation
is not affected by the shift to a stronger security model.

We  assume  that  the  participants  do  not  collude. 
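
As a rough illustration of how a single garbled circuit can be evaluated by
S without OT when A and B are semi-honest, the toy sketch below garbles one
AND gate. It assumes that A and B derive all wire labels from a seed they
share and that S never sees; the label derivation, the zero-tag row
encoding, and every function name here are illustrative assumptions, not
the construction described later in the paper.

```python
import hashlib
import hmac
import secrets

LABEL_LEN = 16  # bytes per wire label


def derive_label(seed: bytes, wire: str, bit: int) -> bytes:
    """Label for (wire, bit), derived from the seed shared by A and B."""
    mac = hmac.new(seed, f"{wire}:{bit}".encode(), hashlib.sha256)
    return mac.digest()[:LABEL_LEN]


def _pad(label_a: bytes, label_b: bytes) -> bytes:
    """32-byte one-time pad derived from a pair of input labels."""
    return hashlib.sha256(label_a + label_b).digest()


def _xor(x: bytes, y: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(x, y))


def garble_and(seed: bytes) -> list:
    """Garbler's side: four shuffled rows, each hiding one output label."""
    rows = []
    for a in (0, 1):
        for b in (0, 1):
            out_label = derive_label(seed, "out", a & b)
            plaintext = out_label + bytes(LABEL_LEN)  # label plus zero tag
            pad = _pad(derive_label(seed, "a", a), derive_label(seed, "b", b))
            rows.append(_xor(plaintext, pad))
    secrets.SystemRandom().shuffle(rows)
    return rows


def evaluate(rows: list, label_a: bytes, label_b: bytes) -> bytes:
    """Server's side: only the row matching the two labels has a zero tag."""
    pad = _pad(label_a, label_b)
    for row in rows:
        plaintext = _xor(row, pad)
        if plaintext[LABEL_LEN:] == bytes(LABEL_LEN):
            return plaintext[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")


def run_toy_protocol(a_bit: int, b_bit: int) -> int:
    seed = secrets.token_bytes(32)             # known to A and B, not to S
    table = garble_and(seed)                   # B garbles the gate
    label_a = derive_label(seed, "a", a_bit)   # A derives her own label
    label_b = derive_label(seed, "b", b_bit)   # B derives his own label
    out_label = evaluate(table, label_a, label_b)  # S evaluates blindly
    # A and B decode the output label using the shared seed.
    return 1 if out_label == derive_label(seed, "out", 1) else 0
```

Because each user can compute the label for her own input bit directly, no
label has to be transferred obliviously in this sketch; that shortcut is
available only because neither user acts as the evaluator.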

All of our constructions offer conceptual simplicity and at the same time
achieve highly attractive performance. The strongest of our models, which
enforces input correctness, is novel and has not been treated in the
context of general secure multi-party computation, and computation based on
garbled circuits in particular. Despite the drastic differences in the
techniques for garbled circuit evaluation and data certification, we show
how they can be integrated by using OT as the connecting point or even when
OT is not used.

Based on the solutions described above, we build implementations of three
genetic tests, namely, genetic common ancestry, paternity, and genetic
compatibility tests. Each test uses a different security setting. We show
through experimental results that each test is efficient, with the worst
runtime being on the order of a couple of seconds. The performance compares
favorably to the state of the art (as detailed in section 7), in some cases
achieving orders of magnitude performance improvement over existing
solutions.

2  Related  Work 

Literature  on  secure  multi-party  computation  is  extensive  and 
cannot  be  covered  here.  In  what  follows,  we  concentrate  on  (i) 
secure  server-aided  two-  or  multi-party  computation  and  (ii) 
work  on  privacy-preserving  solutions  for  genetic  tests. 

Server-aided computation. The closest to our work is that of Herzberg and
Shulman [38, 39], which considers two-party SFE based on garbled circuits
with the aid of weakly trusted servers. The solution achieves security and
fairness in the presence of malicious A and B. The authors also informally
discuss (in [39]) extensions to guarantee security in the presence of
malicious servers or collusion. Compared to that work, our solution in the
presence of malicious A and B is more efficient in that [38, 39] require
the parties to perform $O(\kappa n)$ signature verifications and engage in
$O(\kappa n)$ OTs, where $\kappa$ is the security parameter and $n$ is the
number of (B’s) inputs. The server’s work is also larger than in our
solution. The use of the server, however, is more constrained in [38, 39]
(i.e., the server is used to answer queries of different types, but it does
not participate in interactive computation).

Kamara et al. [47] assume a different setting, where a number of parties
use a server to reduce the computational burden for some of them. Using a
solution based on garbled circuits, the construction achieves work
sublinear in the circuit size for some parties and work polynomial in the
circuit size for the remaining parties and the server. Security holds when
either the server and another party are malicious or when the server is
semi-honest and all but one party are malicious. The model relies on
non-colluding adversaries (termed non-cooperating adversaries in [46, 47]),
who even when behaving maliciously do not collude with others. The work
also addresses fairness. While not directly comparable to our result, the
work of [47] uses what can be viewed as a more challenging security setting
because all of our security settings assume a fixed semi-honest party and
thus allow for more efficient constructions.

Beye et al. [14] supplement homomorphic encryption with server-aided
garbled circuit evaluation for a number of building blocks using the
solution from Kamara et al. [46] in the presence of semi-honest
participants. The latter work provides a protocol that is similar to our
first solution with semi-honest users (in performance and properties), but
additionally involves coin tossing that requires public-key operations.

Carter et al. [24] use the aid of a server to reduce the cost of two-party
SFE based on garbled circuits when any participant can be malicious. One
party is assumed to be very weak (e.g., a mobile phone), while the second
participant and the server are more powerful. The solution lifts most of
the burden of two-party SFE in the presence of malicious participants from
the weak party, but the work of the remaining parties is still comparable
to the work in regular two-party SFE based on garbled circuits. Carter et
al. [23] improve the result by building Whitewash, where the work performed
by the weak party is further reduced. In addition, Mood et al. [60] present
a solution in a similar outsourced setting where garbled outputs can be
mapped to garbled inputs of another circuit to save on both computation and
communication for some functions.

Kolesnikov et al. [50] consider the problem of input consistency in
two-party SFE in the presence of malicious players with the aid of a
semi-honest server. The goal is to ensure that both A and B enter the same
input during multiple interactions, which is enforced with the help of the
semi-honest server at low cost. This solution is not suitable for our goal
of guaranteeing input correctness, as a malicious participant can
consistently provide incorrect inputs and by doing so violate privacy of
possibly multiple users. Furthermore, there may not be multiple
interactions between the same pair of users to enforce input consistency.
That work also mentions the possibility of input certification in secure
two-party computation, but we are not aware of realizations of this idea.

There are also publications [32, 43, 59] in the three-party setting that
utilize garbled circuits. Feige et al. [32] studied minimal models for
secure two-party computation and provided constructions for the setting in
which a function $f$ that produces a bit is to be evaluated on the inputs
of parties A and B, but party C learns the result. Our first protocol for
semi-honest A and B is similar to one of the constructions sketched in that
work (see section 5.1). Two other publications [43, 59], concurrent to and
independent of our work, study secure three-party computation in the
presence of a single malicious party and offer efficient constructions
based on garbled circuits. The solutions provided in both [43] and [59] are
close in their efficiency to two-party protocols based on garbled circuits
in the semi-honest setting, but neither can achieve fairness. In
particular, [43] shows security in the selective abort model while [59]
shows security in the standard model with abort.

Lastly, publications such as [25, 44] put forward generic constructions for
outsourcing secure computation to multiple servers. Unlike this work, the
focus is on enabling clients to verify the result of the computation with
the overall cost being insignificantly higher than the cost of securely
evaluating the function itself. [44] considers any number of clients and
workers, while [25] treats two-party computation that substantially reduces
the work of one party by employing an extra worker.

We summarize the complexity of constructions from prior work and our work
in Table 1. In the table, $u$ denotes the number of non-free gates in
function $f$; $\kappa_1$ ($\kappa_2$) denotes a security parameter for
symmetric (public key) cryptography; $t_1$ ($t_2$) is the number of A’s
(B’s) input bits; $t_3$ is the number of output bits; $t_1^c$ ($t_2^c$) is
the number of certified bits in A’s (B’s) input; $t_1^u = t_1 - t_1^c$
($t_2^u = t_2 - t_2^c$) is the number of remaining input bits of A (B); and
$\sigma$ and $s$ are statistical security parameters (for the number of
garbled circuits and the number of encoding bits of an input bit in the
malicious model, respectively). We use $\min(t, \kappa_1)$ public key
operations for $t$ (1-out-of-2) OTs with an OT extension and assume
$\kappa_2 > \kappa_1$. The function is more complex in [38] and $u^* > u$.
Similarly, $u' > u$ in [23].
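
The OT cost convention above is easy to check numerically. The helper below
is a minimal sketch of that count only; the function name and the concrete
parameter values (a symmetric security parameter of 128 and the two input
lengths) are hypothetical and chosen purely for illustration.

```python
def ot_public_key_ops(t: int, kappa1: int) -> int:
    """Public key operations counted for t 1-out-of-2 OTs with OT extension."""
    if t < 0 or kappa1 <= 0:
        raise ValueError("t must be non-negative and kappa1 positive")
    return min(t, kappa1)


# With many input bits, the count is capped by the security parameter.
long_input = ot_public_key_ops(t=1000, kappa1=128)

# With few input bits, every OT is counted as a public key operation.
short_input = ot_public_key_ops(t=26, kappa1=128)
```

Under this convention the public key cost stops growing once the number of
transferred bits exceeds $\kappa_1$, which is why the table entries for OTs
take the form $\min(\cdot, \kappa_1)$.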

Genomic computation. There are a number of publications, e.g., [9-11] and
others, that treat the problem of privately computing personalized medicine
tests with the goal of choosing an optimal medical treatment or drug
prescription. Ayday et al. [8] also focus on privacy-preserving systems for
storing genomic data by means of homomorphic encryption. Because
personalized medicine is outside the scope of this work, we do not further
elaborate on such solutions.

To the best of our knowledge, privacy-preserving paternity testing was
first considered by Bruekers et al. in [15]. The authors propose
privacy-preserving protocols for a number of genetic tests based on Short
Tandem Repeats (STRs) (see section 3.1 for detail). The tests include
identity testing, paternity tests with one and two parents, and common
ancestry testing on the Y chromosome. The proposed protocols for these
tests are based on additively homomorphic public key encryption and are
secure in the presence of semi-honest participants. Implementation results
were not given in [15], but Baldi et al. [12] estimate that the paternity
test in [15] is several times slower than that in [12]. We thus compare our
paternity test to the performance of an equivalent test in [12].

Baldi et al. [12] concentrate on a different representation of genomic data
(in the form of a fully sequenced human genome) and provide solutions for
paternity, drug testing for personalized medicine, and genetic
compatibility. The solutions use private set intersection as the primary
cryptographic building block in the two-party server-client setting. They
were implemented and shown to result in attractive runtimes, and we compare
the performance of our paternity and compatibility tests to the results
reported in [12] in section 7.

Related to that is the work of De Cristofaro et al. [28], which evaluates
the possibility of using smartphones for performing private genetic tests.
It treated paternity, ancestry, and personalized medicine tests. The
protocol for the paternity test is the same as in [12] with certain
optimizations for the smartphone platform (such as performing
pre-processing on a more powerful machine). The ancestry test is performed
by sampling genomic data, because using inputs of large size was deemed
infeasible on a smartphone. The implementation also used private set
intersection as the building block. Our implementation, however, can handle
inputs of very large sizes at low cost.

Two recent articles [37, 40] describe mechanisms for private testing for
genetic relatives and can detect up to fifth degree cousins. The solutions
rely on fuzzy extractors. They encode genomic data in a special form and
conduct testing on encoded data. The approach is not comparable to the
solutions we put forward here, as [37, 40] are based on non-interactive
computation and are limited to a specific set of functions.

Although not as closely related to our work as publications that implement
specific genetic tests, there are also publications that focus on
applications of string matching to DNA testing. One example is the work of
De Cristofaro et al. [30] that provides a secure and efficient protocol
that hides the size of the pattern to be searched and its position within
the genome. Another example is the work of Katz et al. [48] that applies
secure text processing techniques to DNA matching.

Table 1. Complexity of constructions in prior and our work.

| Scheme | Party | Communication | Sym. key/hash op. | Public key operations | Security model |
|---|---|---|---|---|---|
| [38], [39] | A | $O(\kappa_2(t_1 + s(t_2 + \kappa_1)) + t_3)$ | $O(s(t_2 + \kappa_1))$ | $O(\kappa_1)$ | malicious A or B, fairness |
| | B | $O(\kappa_2(t_1 + s(\kappa_1 + t_2)) + \kappa_1(u^* + t_3))$ | $O(u^* + s(t_2 + \kappa_1))$ | $O(t_1 + s(t_2 + \kappa_1))$ | |
| | S | $O(\kappa_2(t_1 + s(t_2 + \kappa_1)) + \kappa_1(u^* + t_3))$ | $O(u^*)$ | $O(t_1 + s(t_2 + \kappa_1))$ | |
| [47] | A | $O(\kappa_1(\sigma \cdot t_1 + t_2 + t_3))$ | $O(\sigma(t_1 + t_2))$ | - | malicious S, semi-honest A & B, OR semi-honest S, malicious A or B; fairness in [47] only |
| | B | $O(\kappa_1(\sigma(u + t_2) + t_1 + t_3))$ | $O(\sigma(u + t_1 + t_2))$ | - | |
| | S | $O(\kappa_1(\sigma \cdot u + t_1 + t_2 + t_3))$ | $O(\sigma \cdot u)$ | - | |
| [24], [60] | A | $O(\kappa_1(\sigma t_3 + s t_1) + \kappa_2(\sigma t_2 + t_3 + \kappa_1))$ | $O(\sigma t_3)$ | $O(\sigma t_3 + \kappa_1)$ | malicious S, semi-honest A & B, OR semi-honest S, malicious A or B; fairness in [47] only |
| | B | $O(\kappa_1(\sigma(s t_1 + t_2 + t_3)) + \kappa_2(\sigma t_2 + t_3 + \kappa_1))$ | $O(\sigma(u + s t_1 + t_2 + t_3))$ | $O(\sigma(t_2 + t_3) + \kappa_1)$ | |
| | S | $O(\kappa_1(\sigma(s t_1 + t_2 + t_3)) + \kappa_2(\sigma t_2 + t_3))$ | $O(\sigma(u + s t_1 + t_3))$ | $O(\sigma t_3)$ | |
| [23] | A | $O(\kappa_1 \sigma(t_1 + t_3 + \kappa_1))$ | $O(\sigma(t_1 + t_3 + \kappa_1))$ | - | malicious A & S or malicious B |
| | B | $O((\kappa_1(\sigma + \kappa_1)(t_1 + t_3 + \kappa_1) + \sigma u') + t_2(\kappa_1 + t_2))$ | $O((\sigma + \kappa_1)(t_3 + \kappa_1) + \sigma u' + \kappa_1 t_1 + t_2)$ | $O(\kappa_1)$ | |
| | S | $O((\kappa_1(\sigma + \kappa_1)(t_1 + t_3 + \kappa_1) + \sigma u') + t_2(\kappa_1 + t_2))$ | $O((\sigma + \kappa_1)(t_1 + t_3 + \kappa_1) + \sigma(u' + t_2))$ | $O(\kappa_1)$ | |
| [43] | any | $O(\kappa_1(u + t_3) + \kappa_2(t_1 + t_2 + \min(t_1 + t_2, \kappa_1)))$ | $O(u)$ | $O(\min(t_1 + t_2, \kappa_1))$ | one malicious party |
| [59] | any | $O(\kappa_1(u + t_1 + t_2 + t_3))$ | $O(u + t_1 + t_2)$ | - | one malicious party |
| Protocol 1 | A | $O(\kappa_1 \cdot t_1 + t_3)$ | - | - | semi-honest A & B, mal. S, fairness |
| | B, S | $O(\kappa_1(u + t_1 + t_2 + t_3))$ | $O(u)$ | - | |
| Protocol 2 | A | $O(\kappa_1(t_1 + t_3))$ | $O(t_3)$ | - | malicious A or B, semi-honest S, fairness |
| | B | $O(\kappa_1(u + t_2 + t_3) + \kappa_2 \cdot \min(t_2, \kappa_1))$ | $O(u)$ | $O(\min(t_2, \kappa_1))$ | |
| | S | $O(\kappa_1(u + t_1 + t_2 + t_3) + \kappa_2 \cdot \min(t_2, \kappa_1))$ | $O(u + t_3)$ | $O(\min(t_2, \kappa_1))$ | |
| Protocol 3 | A | $O(\kappa_1(t_1 + t_3) + \kappa_2 \cdot t_1^c)$ | $O(t_3)$ | $O(t_1^c)$ | as in protocol 2, plus certified inputs for A and B |
| | B | $O(\kappa_1(t_1 + t_2 + t_3) + \kappa_2(t_2^c + \min(t_2^u, \kappa_1)))$ | $O(u)$ | $O(t_1^c + t_2^c + \min(t_2^u, \kappa_1))$ | |
| | S | $O(\kappa_1(u + t_1 + t_2 + t_3) + \kappa_2(t_1^c + t_2^c + \kappa_1))$ | $O(u + t_3)$ | $O(t_1^c + t_2^c + \min(t_2^u, \kappa_1))$ | |

3  Preliminaries 

3.1  Genomic  testing 

We  next  describe  paternity,  genetic  compatibility,  and  ancestry 
tests.  Genomic  background  can  be  found  in  Appendix  A. 

Paternity test. This test is normally based on STRs (see Appendix A). One’s
STR profile consists of an ordered sequence of $N$ 2-element sets
$S = (\{x_{1,1}, x_{1,2}\}, \{x_{2,1}, x_{2,2}\}, \ldots,
\{x_{N,1}, x_{N,2}\})$, where each value corresponds to the number of
repeats of a specific STR sequence at specific locations in the genome. For
each STR $i$, one of $x_{i,1}$ and $x_{i,2}$ is inherited from the mother
and one from the father.

Thus in the paternity test with a single parent, there are two STR profiles
$S = (\{x_{i,1}, x_{i,2}\})$ and $S' = (\{x'_{i,1}, x'_{i,2}\})$
corresponding to the child and the contested father, respectively. To
determine whether $S'$ corresponds to the child’s father, the test computes
whether for each $i$ the set $\{x_{i,1}, x_{i,2}\}$ contains (at least) one
element from the set $\{x'_{i,1}, x'_{i,2}\}$. In other words, the test
corresponds to the computation

$$\bigwedge_{i=1}^{N} \left[ \{x_{i,1}, x_{i,2}\} \cap \{x'_{i,1}, x'_{i,2}\} \neq \emptyset \right] = \mathrm{True} \qquad (1)$$

When testing with both parents is performed, for each STR $i$ one of
$x_{i,1}$ and $x_{i,2}$ must appear in the mother’s set and the other in
the father’s set. Using both parents’ profiles increases the accuracy of
the test, but even the single parent test has high accuracy for a small
number $N$ of well-chosen STRs (e.g., the US CODIS system utilizes
$N = 13$, while the European SGM Plus identification method uses
$N = 10$).
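
Evaluated in the clear, the single-parent test of equation (1) and its
two-parent variant reduce to a few set operations. The sketch below is a
plaintext reference only, with hypothetical function names and made-up
repeat counts; it is not the private protocol, which evaluates the same
predicate inside a garbled circuit.

```python
from typing import Sequence, Tuple

Locus = Tuple[int, int]  # repeat counts (x_{i,1}, x_{i,2}) at STR i


def single_parent_test(child: Sequence[Locus], father: Sequence[Locus]) -> bool:
    """Equation (1): every STR shares at least one value with the father."""
    if len(child) != len(father):
        raise ValueError("profiles must cover the same N STRs")
    return all(set(c) & set(f) for c, f in zip(child, father))


def two_parent_test(
    child: Sequence[Locus], mother: Sequence[Locus], father: Sequence[Locus]
) -> bool:
    """One child value must come from the mother and the other from the father."""
    if not (len(child) == len(mother) == len(father)):
        raise ValueError("profiles must cover the same N STRs")
    return all(
        (x1 in m and x2 in f) or (x2 in m and x1 in f)
        for (x1, x2), m, f in zip(child, mother, father)
    )


# Made-up profiles with N = 2 STRs.
child = [(10, 12), (7, 9)]
mother = [(10, 11), (7, 8)]
father = [(12, 14), (9, 9)]
stranger = [(11, 14), (9, 9)]
```

Here `single_parent_test(child, father)` holds because each STR shares a
value (12 and 9), while `stranger` fails at the first STR.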

Genetic  compatibility  test.  Here  we  are  interested  in  the  ge¬ 
netic  compatibility  test  where  potential  (or  existing)  partners 
would  like  to  determine  the  possibility  of  transmitting  to  their 
children  a  genetic  disease  with  Mendelian  inheritance.  In  par¬ 
ticular,  if  a  specific  mutation  occurs  in  one  allele,  one  of  the 
alternative  gene  versions  at  a  given  location  (called  minor),  it 
often  has  no  impact  on  one’s  quality  of  life,  but  when  the  mu¬ 
tation  occurs  in  both  alleles  (called  major),  the  disease  man¬ 
ifests  itself  in  severe  forms.  If  both  partners  silently  carry  a 
single  mutation,  they  have  a  noticeable  chance  of  conceiving  a 
child  carrying  the  major  variety.  Thus,  a  genetic  compatibility 
test for a given genetic disease would test for the presence of
minor  mutations  in  both  partners. 

The current practice for screening for most genetic diseases
consists of testing one SNP in a specific gene. It is, however,
expected that in the future tests for more complex diseases
(that involve multiple genes and mutations) will become
available. Thus, a genetic disease can be characterized by a
set of SNP indices and the corresponding values $(i_1, b_1), \ldots,
(i_t, b_t)$, where $i_j$ is the SNP index and $b_j \in \{0, 1\}$ is the value
it takes. Then if the same values are found in the appropriate
SNPs of an individual, the individual is tested as positive
(i.e., the individual is the disease carrier). If both partners test
as positive, then the outcome of the genetic compatibility test
will be treated as positive and otherwise it is negative.

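
The carrier and compatibility predicates just described can be sketched in the clear as follows. This is only an illustration of the function to be computed; the dictionary representation of a genome, the function names, and the sample disease are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Disease = List[Tuple[int, int]]  # [(i_1, b_1), ..., (i_t, b_t)]
Genome = Dict[int, int]          # SNP index -> value in {0, 1}


def is_carrier(genome: Genome, disease: Disease) -> bool:
    """Positive if every listed SNP index holds the listed value."""
    return all(genome.get(index) == value for index, value in disease)


def compatibility_test(a: Genome, b: Genome, disease: Disease) -> bool:
    """Positive only if both partners test as carriers."""
    return is_carrier(a, disease) and is_carrier(b, disease)


# Hypothetical disease characterized by two SNPs.
disease = [(3, 1), (7, 0)]
partner_a = {3: 1, 7: 0, 9: 1}
partner_b = {3: 1, 7: 0}
partner_c = {3: 0, 7: 0}
```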
Ancestry  test.  There  are  a  number  of  tests  that  allow  for  var¬ 
ious  forms  of  ancestry  testing,  for  example,  tests  using  Y- 
chromosome  STRs  (applicable  to  males  only),  mitochondrial 
DNA  (mtDNA)  test  on  the  maternal  line,  and  more  general 
SNP-based  tests  for  common  ancestry  or  one’s  genealogy. 
Many  such  tests  are  not  standardized  and  current  ancestry  and 
genealogy  service  providers  often  use  proprietary  algorithms. 
The  advantage  of  STR-based  tests  is  that  normally  only  a  rela¬ 
tively  small  number  of  STRs  are  tested,  while  SNP-based  tests 
often  utilize  a  large  number  of  (or  even  all  available)  SNPs,  but 
more  distant  ancestry  can  be  learned  from  SNP-based  tests. 
For  improved  accuracy  it  is  also  possible  to  perform  one  type 
of testing after the other. In either case, to determine the most
recent common ancestor between two individuals, the markers
from the two individuals are compared and the number of matching
markers determines how closely the individuals are related. Certain tests
such  as  determining  geographical  regions  of  one’s  ancestors 
normally  require  genetic  data  from  many  individuals. 

3.2  Garbled  circuit  evaluation 

The use of garbled circuits allows two parties $P_1$ and $P_2$ to
securely evaluate a Boolean circuit of their choice. That is, given
an arbitrary function $f(x_1, x_2)$ that depends on private inputs
$x_1$ and $x_2$ of $P_1$ and $P_2$, respectively, the parties first
represent it as a Boolean circuit. One party, say $P_1$, acts as a circuit
generator and creates a garbled representation of the circuit
by associating both values of each binary wire i (including input
and output wires) with random labels $\ell_i^0$ and $\ell_i^1$. The other
party, $P_2$, acts as a circuit evaluator and evaluates the circuit in
its garbled representation without knowing the meaning of the
labels that it handles during the evaluation. The output labels
can be mapped to their meaning and revealed to either or both
parties. Additional details can be found in Appendix A.
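
To make the generator and evaluator roles concrete, the sketch below garbles and evaluates a single AND gate. It is a textbook-style toy (SHA-256 as the row cipher, zero padding to recognize the correct row), not the optimized garbling scheme used in this work; all names and parameters are assumptions chosen for illustration.

```python
import hashlib
import secrets

LABEL_BYTES = 16  # toy label length (128 bits)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad(label_a: bytes, label_b: bytes) -> bytes:
    """32-byte pad derived from the two input labels."""
    return hashlib.sha256(label_a + label_b).digest()


def garble_and_gate():
    """Generator: pick two random labels per wire and encrypt the truth table."""
    wires = {
        w: (secrets.token_bytes(LABEL_BYTES), secrets.token_bytes(LABEL_BYTES))
        for w in ("a", "b", "out")
    }
    rows = []
    for va in (0, 1):
        for vb in (0, 1):
            out_label = wires["out"][va & vb]
            # Zero padding lets the evaluator recognize the one valid row.
            plaintext = out_label + bytes(LABEL_BYTES)
            rows.append(xor(pad(wires["a"][va], wires["b"][vb]), plaintext))
    secrets.SystemRandom().shuffle(rows)  # hide which row is which
    return wires, rows


def evaluate_gate(rows, label_a: bytes, label_b: bytes) -> bytes:
    """Evaluator: holds one label per input wire and learns one output label."""
    p = pad(label_a, label_b)
    for row in rows:
        plaintext = xor(p, row)
        if plaintext[LABEL_BYTES:] == bytes(LABEL_BYTES):
            return plaintext[:LABEL_BYTES]
    raise ValueError("no row decrypted correctly")
```

The evaluator obtains a label for the output wire without learning which bit it encodes; only a party that knows both output labels can map it back to a value.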

The basic approach is secure in the presence of a semi-honest
circuit generator and a malicious evaluator [36] (and
the knowledge of valid labels for the output wires implicitly
proves that the computation was performed correctly [35]).
However,  extending  the  security  to  the  malicious  setting  (when 
either  party  can  be  malicious)  requires  additional  techniques 
which  substantially  degrade  performance  of  the  approach. 

An important component of garbled circuit evaluation is
1-out-of-2 OT. It allows the circuit evaluator to obtain wire labels
corresponding to its inputs. In particular, in OT the sender
(i.e., circuit generator in our case) possesses two strings $s_0$ and
$s_1$ and the receiver (circuit evaluator) has a bit $\sigma$. OT allows
the receiver to obtain string $s_\sigma$ and the sender learns nothing.
An OT extension allows any number of OTs to be realized with
small additional overhead per OT after a constant number of
regular more costly OT protocols (the number of which depends
on the security parameter). The literature contains many
realizations of OT and its extensions, including recent work,
but in this work we primarily are interested in OT protocols
and OT extensions secure in the presence of malicious participants
(such as [42, 61, 62] and others).

The  fastest  currently  available  approach  for  circuit  gen¬ 
eration  and  evaluation  we  are  aware  of  is  by  Bellare  et  al. 
[13].  It  is  compatible  with  earlier  optimizations,  most  notably 
the  “free  XOR”  gate  technique  [52]  that  allows  XOR  gates  to 
be  processed  without  cryptographic  operations  or  communica¬ 
tion,  resulting  in  virtually  no  overhead  for  such  gates. 
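
The free XOR idea can be sketched in a few lines: if the two labels of every wire differ by one global offset, the evaluator processes an XOR gate by XORing the labels it holds. The helper names below are assumptions made for this illustration; the offset is written `delta`, with its last bit forced to 1.

```python
import secrets

LABEL_BYTES = 16  # toy label length


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def new_offset() -> bytes:
    """Global offset shared by all wires; its last bit is set to 1."""
    d = bytearray(secrets.token_bytes(LABEL_BYTES))
    d[-1] |= 1
    return bytes(d)


def new_wire(delta: bytes):
    """Random 0-label; the 1-label is the 0-label XOR delta."""
    l0 = secrets.token_bytes(LABEL_BYTES)
    return l0, xor(l0, delta)


def garble_xor(a, b):
    """Generator: output labels follow from the input labels, no table needed."""
    return xor(a[0], b[0]), xor(a[0], b[1])


def evaluate_xor(label_a: bytes, label_b: bytes) -> bytes:
    """Evaluator: no cryptographic operation and no communication."""
    return xor(label_a, label_b)
```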

3.3 Signature schemes with protocols and commitment schemes

Our  solution  that  enforces  input  correctness  by  means  of  user 
input  certification  relies  on  additional  building  blocks,  which 
are  signature  schemes  with  protocols,  commitment  schemes, 
and  zero-knowledge  proofs  of  knowledge. 

From the available signature schemes, e.g., [16, 17] with
the ability to prove knowledge of a signature on a message
without revealing the message, the Camenisch-Lysyanskaya
scheme [16] is of interest to us. It uses public keys of the form
(n, a, b, c), where n is an RSA modulus and a, b, c are random
quadratic residues in $\mathbb{Z}_n^*$. A signature on message m is a tuple
(e, s, v), where e is prime, e and s are randomly chosen
according to security parameters, and v is computed to satisfy
$v^e \equiv a^m b^s c \pmod{n}$. A signature can be issued on a block
of messages. To sign a block of t messages $m_1, \ldots, m_t$, the
public key needs to be of the form $(n, a_1, \ldots, a_t, b, c)$ and the
signature is (e, s, v), where $v^e \equiv a_1^{m_1} \cdots a_t^{m_t} b^s c \pmod{n}$.

Given a public verification key (n, a, b, c), to prove
knowledge of a signature (e, s, v) on a secret message m, one
forms a commitment c = Com(m) and proves that she possesses
a signature on the value committed in c (see [16] for
details). The commitment c can subsequently be used to prove
additional statements about m in zero knowledge. Similarly,
if one wants to prove statements about multiple messages included
in a signature, multiple commitments will be formed.

The commitment scheme used in [16] is that of Damgård
and Fujisaki [27]. The setup consists of a public key (n, g, h),
where n is an RSA modulus, h is a random quadratic residue
in $\mathbb{Z}_n^*$, and g is an element in the group generated by h. The
modulus n can be the same as or different from the modulus
used in the signature scheme. For simplicity, we assume that
the same modulus is used. To produce a commitment to x using
the key (n, g, h), one randomly chooses $r \in \mathbb{Z}_n$ and sets
$\mathsf{Com}(x, r) = g^x h^r \bmod n$. When the value of r is not essential,
we may omit it and use $\mathsf{Com}(x)$ instead. This commitment
scheme is statistically hiding and computationally binding.
The values x, r are called the opening of $\mathsf{Com}(x, r)$.
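
The commitment equation can be exercised with a small sketch. The modulus below is a toy value chosen only so the arithmetic is easy to follow; it offers no security, and the function names are assumptions introduced for this example.

```python
import math
import secrets

# Toy RSA modulus n = p * q (far too small for real use).
P, Q = 23, 47
N = P * Q


def setup():
    """Return (g, h): h a random quadratic residue mod N, g a power of h."""
    while True:
        z = secrets.randbelow(N - 2) + 2
        if math.gcd(z, N) != 1:
            continue
        h = pow(z, 2, N)
        g = pow(h, secrets.randbelow(N) + 1, N)
        if h != 1 and g != 1:  # skip degenerate choices in the toy group
            return g, h


def commit(g: int, h: int, x: int, r=None):
    """Com(x, r) = g^x * h^r mod N; returns the commitment and r."""
    if r is None:
        r = secrets.randbelow(N)
    return (pow(g, x, N) * pow(h, r, N)) % N, r


def opens_to(g: int, h: int, c: int, x: int, r: int) -> bool:
    """Check that (x, r) is a valid opening of c."""
    return c == (pow(g, x, N) * pow(h, r, N)) % N
```

A committer would publish the first return value of `commit` and later reveal `(x, r)`; the verifier recomputes the product with `opens_to`.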

Zero-knowledge proofs of knowledge (ZKPKs) allow one
to prove a particular statement about private values without
revealing additional information besides the statement itself.
Following [20], we use notation PK{(vars) : statement} to
denote a ZKPK of the given statement, where the values appearing
in the parentheses are private to the prover and the
remaining values used in the statement are known to both
the prover and verifier. If the proof is successful, the verifier
is convinced of the statement of the proof. For example,
$PK\{(\alpha) : y = g_1^{\alpha} \lor y = g_2^{\alpha}\}$ denotes that the prover knows
the discrete logarithm of y to either the base $g_1$ or $g_2$. Lastly,
because a proof of knowledge of a signature is cumbersome
to write in this detailed form, we use abbreviation $\mathsf{Sig}(x)$ and
$\mathsf{Com}(x)$ to indicate the knowledge of a signature and commitment,
respectively. For example, $PK\{(\alpha) : \mathsf{Sig}(\alpha) \land y =
\mathsf{Com}(\alpha) \land (\alpha = 0 \lor \alpha = 1)\}$ denotes a proof of knowledge
of a signature on a bit committed to in y. Because proving the
knowledge of a signature on x in [16] requires a commitment
to x (which is either computed as part of the proof or may
already be available from prior computation), we explicitly include
the commitment into all proofs of a signature.
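
For readers unfamiliar with the PK notation, the simplest statement of this kind, $PK\{(\alpha) : y = g^{\alpha}\}$, can be realized with a Schnorr proof. The sketch below is an assumption-laden illustration, not the proofs used in this work: it uses a tiny prime-order group, makes the proof non-interactive with a hash-derived challenge (Fiat-Shamir), and does not cover OR-statements or signatures.

```python
import hashlib
import secrets

# Toy group: P = 2*Q + 1 with P, Q prime; G = 4 generates the order-Q subgroup.
P = 2039
Q = 1019
G = 4


def challenge(y: int, t: int) -> int:
    """Hash-derived challenge in place of the verifier's random choice."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(alpha: int):
    """Prover knows alpha; returns the public value y and a proof (t, s)."""
    y = pow(G, alpha, P)
    k = secrets.randbelow(Q)
    t = pow(G, k, P)
    s = (k + challenge(y, t) * alpha) % Q
    return y, (t, s)


def verify(y: int, proof) -> bool:
    """Accept iff G^s == t * y^c (mod P)."""
    t, s = proof
    return pow(G, s, P) == (t * pow(y, challenge(y, t), P)) % P
```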

4  Security  Model 

We  formulate  security  using  the  standard  ideal/real  model  for 
secure  multi-party  computation,  where  the  view  of  any  adver¬ 
sary  in  the  real  protocol  execution  should  be  indistinguishable 
from  its  view  in  the  ideal  model  where  a  trusted  party  (TP) 
evaluates  the  function.  Because  the  server  does  not  contribute 
any  input,  it  is  meaningful  to  consider  that  either  A  or  B  is 
honest  since  the  goal  is  to  protect  the  honest  party. 

As  previously  mentioned,  we  are  primarily  interested  in 
the  setting  where  the  server  is  semi-honest,  but  parties  A  and  B 
may either be semi-honest or fully malicious. Thus, we target
security models where S complies with the computation, with
the  exception  of  the  first  setting  of  semi-honest  A  and  B,  where 
we  get  security  in  the  presence  of  a  malicious  server  for  free. 
We  similarly  assume  that  the  server  will  not  collude  with  users 
(putting  its  reputation  at  risk)  or  let  users  affect  its  operation. 

We  obtain  security  settings  where  (1)  A  and  B  can  be  cor¬ 
rupted  by  a  semi-honest  adversary,  while  S  can  act  on  behalf 
of  a  fully  malicious  adversary  and  (2)  A  and  B  can  be  mali¬ 
cious,  but  the  server  is  semi-honest.  Because  we  assume  that 
the  parties  (or  the  adversaries  who  corrupt  them)  do  not  col¬ 
lude,  at  any  given  point  of  time  there  might  be  multiple  adver¬ 
saries,  but  they  are  independent  of  each  other.  This  is  similar 
to  the  setting  used  in  [46,  47].  We  note  that  based  on  the  se¬ 
curity  settings  listed  above,  at  most  one  adversary  would  be 
fully  malicious.  In  other  words,  if  in  (2)  A  is  malicious,  the 
goal  is  to  protect  B  who  is  assumed  to  not  be  malicious  and  S 
is  semi-honest,  while  in  (1)  S  can  be  malicious,  while  A  and  B 
are  semi-honest.  Kamara  et  al.  [46],  however,  show  that  in  the 
presence  of  non-cooperating  adversaries  who  corrupt  only  one 
party,  showing  security  can  be  reduced  to  showing  that  the  pro¬ 
tocol  is  secure  in  the  presence  of  semi-honest  adversaries  only, 
followed  by  proving  for  each  malicious  adversary  Ai  that  the 
solution  is  secure  in  the  presence  of  Ai  when  all  other  parties 
are  honest.  More  precisely,  we  rely  on  the  following  lemma: 

Lemma 1 ([46]). If a multi-party protocol $\Pi$ between n parties
$P_1, \ldots, P_n$ securely computes f in the presence of (i) independent
and semi-honest adversaries and (ii) a malicious
$\mathcal{A}_i$ and honest $\{\mathcal{A}_j\}_{j \neq i}$, then $\Pi$ is also secure in the presence
of an adversary $\mathcal{A}_i$ that is non-cooperative with respect to all
other semi-honest adversaries.

This  implies  that  in  our  setting  (2)  a  solution  secure  in  the 
presence  of  malicious  A  or  B  will  also  remain  secure  when  A 
and  B  are  corrupted  by  two  independent  malicious  adversaries. 

To model fairness, we modify the behavior of the TP in
the ideal model to send $\bot$ to all parties if any party chooses
to abort (note that fairness is only applicable to A and B). We
assume that A and B learn the result of evaluation of a predefined
function f that takes input $x_1$ from A and $x_2$ from B,
and the server learns nothing. Because our primary motivation
is genomic computation, we consider single-output functions,
i.e., both A and B learn $f(x_1, x_2)$ (but two of our constructions
support functions where A's and B's outputs differ and the
remaining protocol in the present form loses only fairness).

Execution in the real model. The execution of protocol $\Pi$ in
the real model takes place between parties A, B, S and a subset
of adversaries $\mathcal{A}_A, \mathcal{A}_B, \mathcal{A}_S$ who can corrupt the corresponding
party. Let $\mathcal{A}$ denote the set of adversaries present in a given
protocol execution. A and B receive their respective inputs $x_i$
and a set of random coins $r_i$, while S receives only a set of
random coins $r_3$. All parties also receive security parameter
$1^{\kappa}$. Each adversary receives all information that the party it
corrupted has and a malicious adversary can also instruct the
corresponding corrupted party to behave in a certain way. For
each $\mathcal{A}_X \in \mathcal{A}$, let $\mathrm{VIEW}_{\Pi,\mathcal{A}_X}$ denote the view of the adversary
$\mathcal{A}_X$ at the end of an execution of $\Pi$. Also let $\mathrm{OUT}^{hon}_{\Pi,\mathcal{A}}$ denote
the output of the honest parties (if any) after the same execution
of the protocol. Then for each $\mathcal{A}_X \in \mathcal{A}$, we define the
partial output of a real-model execution of $\Pi$ between A, B, S
in the presence of $\mathcal{A}$ by

$$\mathrm{REAL}_{\Pi,\mathcal{A}_X}(\kappa, x_1, x_2, r_1, r_2, r_3) \stackrel{\mathrm{def}}{=} \mathrm{VIEW}_{\Pi,\mathcal{A}_X} \cup \mathrm{OUT}^{hon}_{\Pi,\mathcal{A}}.$$

Execution in the ideal model. In the ideal model, all parties
interact with a TP who evaluates f. Similar to the real
model, the execution begins with A and B receiving their respective
inputs $x_i$ and each party (A, B, and S) receiving security
parameter $1^{\kappa}$. Each honest (semi-honest) party sends to
the TP $x'_i = x_i$ and each malicious party can send an arbitrary
value $x'_i$ to the TP. If $x'_1$ or $x'_2$ is equal to $\bot$ (empty)
or if the TP receives an abort message, the TP returns $\bot$ to
all participants. Otherwise, A and B receive $f(x'_1, x'_2)$. Let
$\mathrm{OUT}^{hon}_{f,\mathcal{A}}$ denote the output returned by the TP to the honest
parties and let $\mathrm{OUT}_{f,\mathcal{A}_X}$ denote the output that corrupted
party $\mathcal{A}_X \in \mathcal{A}$ produces based on an arbitrary function of its
view. For each $\mathcal{A}_X \in \mathcal{A}$, the partial output of an ideal-model
execution of f between A, B, S in the presence of $\mathcal{A}$ is denoted
by $\mathrm{IDEAL}_{f,\mathcal{A}_X}(\kappa, x_1, x_2) \stackrel{\mathrm{def}}{=} \mathrm{OUT}_{f,\mathcal{A}_X} \cup \mathrm{OUT}^{hon}_{f,\mathcal{A}}$.

Definition 1 (Security). A three-party protocol $\Pi$ between A,
B, and S securely computes f if for all sets of probabilistic
polynomial time (PPT) adversaries $\mathcal{A}$ in the real model, for
all $x_i$ and $\kappa \in \mathbb{Z}$, there exists a PPT transformation $\mathcal{S}_X$ for
each $\mathcal{A}_X \in \mathcal{A}$ such that $\mathrm{REAL}_{\Pi,\mathcal{A}_X}(\kappa, x_1, x_2, r_1, r_2, r_3) \stackrel{c}{\approx}
\mathrm{IDEAL}_{f,\mathcal{S}_X}(\kappa, x_1, x_2)$, where each $r_i$ is chosen uniformly at
random and $\stackrel{c}{\approx}$ denotes computational indistinguishability.

To model the setting where some of the inputs of A and/or B
are certified, we augment the function f to be executed with
the specification of what inputs are to be certified and two additional
inputs $y_1$ and $y_2$ that provide certification for A's and
B's inputs, respectively. Then in the ideal model execution, the
TP will be charged with additionally receiving the $y_i$'s. If the TP
does not receive all inputs or if upon receiving all inputs some
inputs requiring certification do not verify, it sends $\bot$ to all parties.
In the real model execution, verification of certified inputs
is built into $\Pi$ and besides using two additional inputs $y_1$ and
$y_2$ the specification of the execution remains unchanged.

Definition 2 (Security with certified inputs). A three-party
protocol $\Pi$ between A, B, and S securely computes f if for all
sets of PPT adversaries $\mathcal{A}$ in the real model, for all $x_i$, $y_i$, and
$\kappa \in \mathbb{Z}$, there exists a PPT transformation $\mathcal{S}_X$ for each $\mathcal{A}_X \in
\mathcal{A}$ such that $\mathrm{REAL}_{\Pi,\mathcal{A}_X}(\kappa, x_1, x_2, y_1, y_2, r_1, r_2, r_3) \stackrel{c}{\approx}
\mathrm{IDEAL}_{f,\mathcal{S}_X}(\kappa, x_1, x_2, y_1, y_2)$, where each $r_i$ is chosen uniformly
at random.

5  Server-Aided  Computation 

In this section we detail our solutions for server-aided two-party
computation based on garbled circuits. The current description
is general and can be applied to any function f. In
Section 6 we describe how these constructions can be applied
to genomic tests to result in fast performance.

5.1  Semi-honest  A  and  B,  malicious  S 

Our  first  security  setting  is  where  A  and  B  are  semi-honest 
and  S  can  be  malicious.  The  main  intuition  behind  the  solu¬ 
tion  is  that  when  A  and  B  can  be  assumed  to  be  semi-honest 
and  a  solution  based  on  garbled  circuit  evaluation  is  used,  we 
will  charge  S  with  the  task  of  evaluating  a  garbled  circuit. 
That  is,  security  is  maintained  in  the  presence  of  malicious 
server  because  garbled  circuit  evaluation  techniques  are  secure 
in the presence of a malicious evaluator. Next, we notice that
if A and B jointly form garbled representation of the circuit
for the function f they would like to evaluate, both of them
can have access to the pairs of labels $(\ell_i^0, \ell_i^1)$ corresponding
to the input wires. Thus, they can simply send the appropriate
label $\ell_i^b$ to S for evaluation purposes for their value of the
input bit b for each input wire. This eliminates the need for
OT and results in a solution that outperforms a two-party protocol
in the presence of only semi-honest participants. The
same idea was sketched in [32] (with the difference that S
was to learn the output). The use of a pseudo-random function
$\mathsf{PRF} : \{0,1\}^{\kappa} \times \{0,1\}^{*} \to \{0,1\}^{\kappa}$ with security parameter $\kappa$
for deriving wire labels in the scheme is as in [59].

A more detailed description of the solution, which we denote
as Protocol 1, is given next. In what follows, let m denote
the total number of wires in a circuit (including input and
output wires), wires $1, \ldots, t_1$ correspond to A's input, wires
$t_1 + 1, \ldots, t_1 + t_2$ correspond to B's input, and the last $t_3$ wires
$m - t_3 + 1, \ldots, m$ correspond to the output wires. We also use
$\kappa$ to denote security parameter (for symmetric key cryptography).
Notation $a \stackrel{R}{\leftarrow} U$ means that the value of a is chosen
uniformly at random from the set U. The protocol is written to
utilize the free XOR technique, where $\ell_i^0 \oplus \ell_i^1$ must take the
same value $\Delta$ for all circuit wires i and the last bit of $\Delta$ is 1.

Input: A has private input $x_1$, B has private input $x_2$, and S has no
private input.

Output: A and B learn $f(x_1, x_2)$, S learns nothing.

Protocol 1:

1. A and B jointly choose $\delta \stackrel{R}{\leftarrow} \{0,1\}^{\kappa-1}$, $k \stackrel{R}{\leftarrow} \{0,1\}^{\kappa}$, and set
$\Delta = \delta \| 1$. They jointly produce m pairs of garbled labels as $\ell_i^0 =
\mathsf{PRF}(k, i)$ and $\ell_i^1 = \ell_i^0 \oplus \Delta$ for $i \in [1, m]$, garble the gates to
produce garbled circuit $G_f$ for f, and send $G_f$ to S.

2. For each $i \in [1, t_1]$, A locates the ith bit $b_i$ of her input and sends
to S the label $\ell_i^{b_i}$ of the corresponding wire i in the garbled circuit.

3. Similarly, for each $j \in [1, t_2]$, B locates the jth bit $b_j$ of his
input and sends to S the label $\ell_{j+t_1}^{b_j}$ of the corresponding wire
$j + t_1$ in the garbled circuit.

4. S evaluates the circuit on the received inputs and returns to B the
computed label $\hat{\ell}_i$ for each output wire $i \in [m - t_3 + 1, m]$. B
forwards all received information to A.

5. For each $\hat{\ell}_i$ returned by S ($i \in [m - t_3 + 1, m]$), A and B do the
following: if $\hat{\ell}_i = \ell_i^0$, set the $(i - m + t_3)$th bit of the output to 0; if
$\hat{\ell}_i = \ell_i^1$, set the $(i - m + t_3)$th bit of the output to 1; otherwise abort.

In Protocol 1, the easiest way for A and B to jointly choose
random values is for one party to produce them and communicate
them to the other party. In this solution, the combined work
of A and B is linear in the size of the circuit for f. The work,
however, can be distributed in an arbitrary manner as long as
S receives all garbled gates (e.g., half of $G_f$ from A and the
other half from B). Besides equally splitting the work of circuit
garbling between the parties, an alternative possibility is
to let the weaker party (e.g., a mobile phone user) do work
sublinear in the circuit size. Let A be a weak client, who delegates
as much work as possible to B. Then B generates the
entire garbled circuit and sends it to S, while A will only need
to create $t_1$ label pairs corresponding to her input, to be used in
step 2 of the protocol. Upon completion of the computation, A learns
the output from B (i.e., there is no need for A to know labels
for the output wires). Thus, the work and communication of
the weaker client is only linear in the input and output sizes.
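
The label handling in steps 1, 2, 3, and 5 of Protocol 1 can be sketched as follows. This is a minimal illustration under stated assumptions: HMAC-SHA256 truncated to the label length stands in for PRF, labels are byte strings, and the function names are hypothetical. Gate garbling and S's evaluation (step 4) are omitted.

```python
import hashlib
import hmac
import secrets

LABEL_BYTES = 16  # toy value for the security parameter, in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def prf(key: bytes, i: int) -> bytes:
    """Stand-in for PRF(k, i): HMAC-SHA256 truncated to LABEL_BYTES."""
    return hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()[:LABEL_BYTES]


def setup():
    """Step 1: shared PRF key k and global offset whose last bit is 1."""
    k = secrets.token_bytes(LABEL_BYTES)
    delta = bytearray(secrets.token_bytes(LABEL_BYTES))
    delta[-1] |= 1
    return k, bytes(delta)


def label(k: bytes, delta: bytes, i: int, bit: int) -> bytes:
    """Label of wire i for the given bit: PRF(k, i), XORed with delta if bit is 1."""
    l0 = prf(k, i)
    return xor(l0, delta) if bit else l0


def input_labels(k: bytes, delta: bytes, bits, first_wire: int):
    """Steps 2-3: labels a party sends to S for wires first_wire, first_wire+1, ..."""
    return [label(k, delta, first_wire + j, b) for j, b in enumerate(bits)]


def decode_outputs(k: bytes, delta: bytes, returned, m: int, t3: int):
    """Step 5: map the labels returned by S to output bits, aborting on a bad label."""
    out = []
    for offset, lab in enumerate(returned):
        i = m - t3 + 1 + offset
        if lab == label(k, delta, i, 0):
            out.append(0)
        elif lab == label(k, delta, i, 1):
            out.append(1)
        else:
            raise ValueError("invalid output label: abort")
    return out
```

Because both A and B hold `k` and `delta`, either of them can recompute any wire label locally, which is what removes the need for OT in this setting.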

Security  of  this  solution  can  be  stated  as  follows,  and  the 
proof  is  deferred  to  the  full  version  due  to  space  constraints. 

Theorem  1.  Protocol  1  fairly  and  securely  evaluates  function 
f  in  the  presence  of  semi-honest  A  and  B  and  malicious  S. 

5.2  Semi-honest  S,  malicious  A  and  B 

To  maintain  efficiency  of  the  previous  solution  by  avoiding 
the  cost  of  OT,  we  might  want  to  preserve  the  high-level  struc¬ 
ture  of  the  computation  in  the  first  solution.  Now,  however, 
because  A  and  B  can  be  malicious,  neither  of  them  can  rely  on 
the  other  party  in  garbling  the  circuit  correctly.  To  address  this, 
each of A and B may garble their own circuit for f, send it to
S, and S will be in charge of evaluating both of them and performing
a consistency check on the results (without learning
the output). With this solution, A would create label pairs for
her input bits/wires for both garbled circuits and communicate
one  set  of  pairs  to  B  who  uses  them  in  constructing  his  circuit. 
What  this  achieves  is  that  now  A  can  directly  send  to  S  the 
labels  corresponding  to  her  input  bits  for  circuit  evaluation  for 
both  circuits.  B  performs  identical  operations.  There  is  still  no 
need  to  perform  OT,  but  two  security  issues  arise:  (1)  A  and 
B  must  be  forced  to  provide  consistent  inputs  into  both  cir¬ 
cuits  and  (2)  regardless  of  whether  the  parties  learn  the  output 
(e.g.,  whether  the  computation  is  aborted  or  not),  a  malicious 
party  can  learn  one  bit  of  information  about  the  other  party’s 
input (by constructing a circuit that does not correspond to f)
[41,  57].  While  the  first  issue  can  be  inexpensively  addressed 
using  the  solution  of  [50]  (which  works  in  the  presence  of  ma¬ 
licious  users  and  semi-honest  server),  the  second  issue  will 
still  stand  with  this  structure  of  the  computation. 

Instead  of  allowing  for  (1-bit)  information  leakage  about 
private  inputs,  we  change  the  way  the  computation  takes  place. 
If  we  now  let  the  server  garble  the  circuit  and  each  of  the  re¬ 
maining  parties  evaluate  a  copy  of  it,  the  need  for  OT  (for  both 
A  and  B’s  inputs)  arises.  We,  however,  were  able  to  eliminate 
the  use  of  OT  for  one  of  A  and  B  and  construct  a  solution  that 
has  about  the  same  cost  as  a  single  two-party  solution  in  the 
semi-honest model. At a high level, it proceeds as follows: A
creates garbled label pairs $(\ell_i^0, \ell_i^1)$ for the wires corresponding
to her inputs only and sends them to S. S uses the pairs to construct
a garbled circuit for f and sends it to B. S and B engage
in  OT,  at  the  end  of  which  B  learns  labels  corresponding  to 
his  input  bits.  Also,  A  sends  to  B  the  labels  corresponding  to 
her  input  bits,  which  allows  B  to  evaluate  the  circuit.  We  note 
that  because  A  may  act  maliciously,  she  might  send  to  B  in¬ 
correct  labels,  which  will  result  in  B’s  inability  to  evaluate  the 
circuit.  This,  however,  is  equivalent  to  A  aborting  the  protocol. 
In either case, neither A nor B learns any output and the solution
achieves fairness. Similarly, if B does not perform circuit
evaluation  correctly,  neither  party  learns  the  output. 

The next issue that needs to be addressed is that of fairly
learning the output. We note that S cannot simply send the
label pairs for the output wires to A and B as this would allow
B to learn the output and deny A this knowledge. Instead,
upon completion of garbled circuit evaluation, B sends
the  computed  labels  to  A.  With  the  help  of  S,  A  verifies  that  the 
labels  A  possesses  are  indeed  valid  labels  for  the  output  wires 
without  learning  the  meaning  of  the  output.  Once  A  is  satis¬ 
fied,  she  notifies  S  who  sends  the  label  pairs  to  A  and  B,  both 
of  whom  can  interpret  and  learn  the  result.  We  note  that  mali¬ 
cious  A  can  report  failure  to  S  even  if  verification  of  the  valid¬ 
ity  of  the  output  labels  received  from  B  was  successful.  Once 
again,  this  is  equivalent  to  A  aborting  the  protocol,  in  which 
case  neither  party  learns  the  output  and  fairness  is  maintained. 

Our solution is given as Protocol 2 and uses a hash function
$H : \{0,1\}^{*} \to \{0,1\}^{\kappa}$ that we treat as a random oracle.

Input: A has private input $x_1$, B has private input $x_2$, and S has no
private input.

Output: A and B learn $f(x_1, x_2)$, S learns nothing.

Protocol 2:

1. S chooses $\delta \stackrel{R}{\leftarrow} \{0,1\}^{\kappa-1}$, $k_1 \stackrel{R}{\leftarrow} \{0,1\}^{\kappa}$, $k_2 \stackrel{R}{\leftarrow} \{0,1\}^{\kappa}$, and
sets $\Delta = \delta \| 1$. S sends $\Delta$ and $k_1$ to A.

2. S computes wire labels $\ell_i^0 = \mathsf{PRF}(k_1, i)$ for $i \in [1, t_1]$, $\ell_i^0 =
\mathsf{PRF}(k_2, i - t_1)$ for $i \in [t_1 + 1, m]$, and sets $\ell_i^1 = \ell_i^0 \oplus \Delta$ for
$i \in [1, m]$. S then constructs garbled gates $G_f$ and sends $G_f$ to B.

3. S and B engage in $t_2$ instances of 1-out-of-2 OT, where S assumes
the role of the sender and uses $t_2$ label pairs $(\ell_{t_1+i}^0, \ell_{t_1+i}^1)$ for
$i \in [1, t_2]$ corresponding to B's input wires as its input and B
assumes the role of the receiver and uses his $t_2$ input bits $b_i$ as the
input into the protocol. As the result of the interaction, B learns
garbled labels $\ell_{t_1+i}^{b_i}$ for $i \in [1, t_2]$.

4. A computes labels $\ell_i^0 = \mathsf{PRF}(k_1, i)$ for $i \in [1, t_1]$ and sends to
B $\ell_i^{b_i}$ for her input bits $b_i$, where $\ell_i^1 = \ell_i^0 \oplus \Delta$ for any $b_i = 1$.

5. After receiving the labels for his own and A's input, B evaluates
the circuit, learns the output labels $\ell_i^{b_i}$ for $i \in [m - t_3 + 1, m]$
and sends them to A.

6. A requests from S output verification constructed as follows: For
each output wire i, S computes $H(\ell_i^0)$, $H(\ell_i^1)$, randomly permutes
the tuple, and sends it to A.

7. For each label $\ell_i$ received from B in step 5, A computes $H(\ell_i)$
and checks whether the computed value appears among $H(\ell_i^0)$,
$H(\ell_i^1)$ received from S in step 6. If the check succeeds for all
output wires, A notifies S of success and aborts otherwise.

8. Upon receiving confirmation of success from A, S sends $(\ell_i^0, \ell_i^1)$
for all output wires i to A and B, who recover the output.
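
Steps 6 through 8 of Protocol 2 can be illustrated with a short sketch. SHA-256 stands in for the random oracle H, labels are byte strings, and the function names are assumptions introduced here. The point of the sketch is that A can confirm the labels from B are valid output labels before learning which bit each one encodes.

```python
import hashlib
import secrets


def H(label: bytes) -> bytes:
    """Stand-in for the random oracle H."""
    return hashlib.sha256(label).digest()


def server_verification_data(output_pairs):
    """Step 6 (S): hashes of both labels of each output wire, in random order."""
    rng = secrets.SystemRandom()
    data = []
    for l0, l1 in output_pairs:
        pair = [H(l0), H(l1)]
        rng.shuffle(pair)  # hide which hash belongs to which bit
        data.append(tuple(pair))
    return data


def a_accepts(labels_from_b, data) -> bool:
    """Step 7 (A): every received label must hash to one of the two values."""
    if len(labels_from_b) != len(data):
        return False
    return all(H(lab) in pair for lab, pair in zip(labels_from_b, data))


def decode(labels_from_b, output_pairs):
    """Step 8 (A and B): after S releases the pairs, map labels to bits."""
    return [pair.index(lab) for lab, pair in zip(labels_from_b, output_pairs)]
```

If A reports failure, S never releases the label pairs, so neither party can run `decode`, which matches the fairness argument given before the protocol.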


We  show  security  of  this  solution  in  a  hybrid  model  where  the 
parties  are  given  access  to  a  trusted  entity  computing  OT.  The 
proof  can  be  found  in  Appendix  C. 

Theorem  2.  Protocol  2  fairly  and  securely  evaluates  function 
f  in  the  presence  of  malicious  A  or  B  and  semi-honest  S  in  the 
hybrid  model  with  ideal  implementation  of  OT  and  where  H  is 
modeled  as  a  random  oracle. 

5.3 Semi-honest S, malicious A and B with input certification

We next consider an enhanced security setting in which malicious
A and B are forced to provide correct inputs in the
computation. This enforcement is performed by requiring A
and B to certify their inputs prior to protocol execution and
prove the existence of certification on the inputs they enter.

The  basic  structure  of  our  solution  in  this  stronger  secu¬ 
rity  model  remains  the  same  as  in  Protocol  2,  but  we  extend  it 
with  a  novel  mechanism  for  obliviously  verifying  correctness 
of the inputs. The intricate part of this problem is that signature
schemes use public-key operations, while garbled circuit
evaluation deals with randomly generated labels and symmetric
key operations. In what follows, we describe the intuition
behind our solution followed by more detailed explanation.

Suppose  that  the  party  whose  inputs  are  to  be  verified  par¬ 
ticipates  in  an  OT  protocol  on  her  inputs  as  part  of  garbled  cir¬ 
cuit  evaluation  (i.e.,  the  party  is  the  circuit  evaluator  and  acts  as 
the  receiver  in  the  OT).  Then  if  we  use  the  variant  of  OT  known 
as  committed  oblivious  transfer  (COT)  (also  called  verifiable 
OT  in  some  literature),  the  party  will  submit  commitments  to 
the  bits  of  her  input  as  part  of  OT  computation  and  these  com¬ 
mitments  can  be  naturally  tied  to  the  values  signed  by  a  third 
party  authority  by  means  of  ZKPKs  (i.e.,  without  revealing 
anything  other  than  equality  of  the  signed  values  and  the  val¬ 
ues  used  in  the  commitments).  Several  COT  schemes  that  we 
examined  (such  as  in  [45,  49]),  however,  had  disadvantages  in 
their  performance  and/or  complex  setup  assumptions  (such  as 
requiring  the  sender  and  receiver  to  hold  shares  of  the  decryp¬ 
tion  key  for  a  homomorphic  public-key  encryption  scheme). 
We  thus  choose  to  integrate  input  certification  directly  with  a 
conventional  OT  protocol  by  Naor  and  Pinkas  [61]. 

Before we proceed with further description, we discuss the choice
of the signature scheme and the way knowledge of a signature is
proved. Between the two main candidate signature schemes with
protocols, [16] and [17], we chose the one from [16] because it uses
an RSA modulus. In applications like ours, zero-knowledge statements
are to be proved across different groups. This requires the use of
statistically-hiding zero-knowledge proofs that connect two different
groups through a setting in which the Strong RSA assumption (or, more
generally, the difficulty of $e$th root extraction) holds [18, 27, 34].
Thus,  the  public  key  of  the  third  party  certification  authority 
can  be  conveniently  used  as  the  common  setup  for  other  in¬ 
teraction  between  the  prover  and  verifier.  This  has  important 
implications  on  the  use  of  such  solutions  in  practice.  (If  multi¬ 
ple  signatures  are  issued  by  multiple  authorities,  i.e.,  medical 
facilities  in  our  application,  one  of  the  available  public  keys 
can  be  used  to  instantiate  the  common  setup.) 

Recall  that  in  Protocol  2,  B  obtains  the  labels  correspond¬ 
ing  to  his  input  from  S  via  OT,  while  A  knows  all  label  pairs 
for  her  input  wires  and  simply  sends  the  appropriate  labels  to 
B.  Now  both  of  them  have  to  prove  to  S  that  the  inputs  they 
enter  in  the  protocol  have  been  certified  by  a  certain  authority. 
For  simplicity,  in  what  follows  we  assume  that  all  of  A’s  and 
B’s  inputs  are  to  be  verified.  (If  this  is  not  the  case  and  only  a 
subset  of  the  inputs  should  be  verified,  the  computation  associ¬ 
ated  with  input  verification  described  below  is  simply  omitted 
for  some  of  the  input  bits.)  Let  us  start  with  the  verification 
mechanism  for  B,  after  which  we  treat  the  case  of  A. 

B engages in the Naor-Pinkas OT in the role of the receiver. The
details of the OT protocol are given in Appendix A. As part of OT, B
forms two keys $PK_0$ and $PK_1$, where $PK_\sigma$ is the key that
will be used to recover $m_\sigma$. Thus, if we want to enforce that
$\sigma$ corresponds to the bit for which B has a signature from a
certification authority, B must prove that he knows the discrete
logarithm of $PK_\sigma$, where $\sigma$ is the signed bit. More
formally, the statement B has to prove in zero knowledge is

$$PK\{(\sigma, \beta) : \mathrm{Sig}(\sigma) \wedge y = \mathrm{Com}(\sigma) \wedge ((\sigma = 0 \wedge PK_0 = g^{\beta}) \vee (\sigma = 1 \wedge PK_1 = g^{\beta}))\}.$$

In other words, B has a signature on 0 and knows the discrete
logarithm of $PK_0$ to the base $g$ (i.e., constructed $PK_0$ as
$g^{\beta}$) or B has a signature on 1 and knows the discrete logarithm
of $PK_1$ to the same base. Using a technically more precise PK
statement for showing that $\sigma$ is 0 or 1 would result in the PK
statement above being re-written as

$$PK\{(\sigma, \alpha, \beta) : \mathrm{Sig}(\sigma) \wedge y = \mathrm{Com}(\sigma, \alpha) = g^{\sigma}h^{\alpha} \wedge ((y = h^{\alpha} \wedge PK_0 = g^{\beta}) \vee (y/g = h^{\alpha} \wedge PK_1 = g^{\beta}))\}.$$

We note that it is known how to realize this statement as a ZKPK as it
uses only conjunction and disjunction of discrete logarithm-based
sub-statements (see, e.g., [21]). Executing this ZKPK would allow S to
verify B’s input for a particular input wire if B has a signature on a
bit. In practice, however, a signature is expected to be on messages
from a larger space than $\{0, 1\}$ and thus a single signature will
need to be used to provide inputs for several input wires in the
circuit. This can be accomplished by, in addition to using a commitment
on the signed message, creating commitments to the individual bits and
showing that they correspond to the binary representation of the signed
message. Then the commitments to the bits of the message are linked to
the keys generated in each instance of the OT. More formally, the ZKPK
statement for a $t$-bit signed value would become:

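$$PK\{(\sigma, \alpha, \sigma_1, \ldots, \sigma_t, \alpha_1, \ldots, \alpha_t) : \mathrm{Sig}(\sigma) \wedge y = g^{\sigma}h^{\alpha} \wedge y_1 = g^{\sigma_1}h^{\alpha_1} \wedge \ldots \wedge y_t = g^{\sigma_t}h^{\alpha_t} \wedge \sigma = \sum_{i=1}^{t} 2^{i-1}\sigma_i\} \qquad (2)$$

$$PK\{(\sigma_i, \alpha_i, \beta_i) : y_i = g^{\sigma_i}h^{\alpha_i} \wedge ((y_i = h^{\alpha_i} \wedge PK_0^{(i)} = g^{\beta_i}) \vee (y_i/g = h^{\alpha_i} \wedge PK_1^{(i)} = g^{\beta_i}))\}. \qquad (3)$$

Notation $PK_0^{(i)}$ and $PK_1^{(i)}$ denotes the public keys used in
the $i$th instance of Naor-Pinkas OT. [21] shows how to prove
that discrete logarithms satisfy a given linear equation.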
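
The algebra behind the bit-decomposition clause of equation 2 can be illustrated with a small sketch. This is not the ZKPK itself: it assumes a toy group (p = 2039, q = 1019, g = 4, h = 9, all chosen here for illustration and far too small for real use), and it makes the simplifying assumption that the randomness of y is derived from the bit randomness, so that the relation can be checked directly instead of being proved in zero knowledge. The function names are hypothetical.

```python
import secrets

# Toy parameters (assumption): P = 2*Q + 1 with P and Q prime; insecure sizes.
P, Q = 2039, 1019
G, H = 4, 9  # squares mod P, so both lie in the subgroup of order Q


def commit(value, rand):
    """Pedersen-style commitment g^value * h^rand mod P."""
    return (pow(G, value, P) * pow(H, rand, P)) % P


def commit_bits(sigma, t):
    """Commit to a t-bit value sigma and to each of its bits."""
    bits = [(sigma >> i) & 1 for i in range(t)]        # bits[i] plays sigma_{i+1}
    rands = [secrets.randbelow(Q) for _ in range(t)]   # alpha_1 .. alpha_t
    # Simplification: alpha = sum 2^(i-1) * alpha_i, so the check below is public.
    alpha = sum(r << i for i, r in enumerate(rands)) % Q
    y = commit(sigma, alpha)
    ys = [commit(b, r) for b, r in zip(bits, rands)]
    return y, ys, bits, rands


def bits_consistent(y, ys):
    """Check y == prod y_i^(2^(i-1)), i.e. sigma = sum 2^(i-1) * sigma_i."""
    acc = 1
    for i, yi in enumerate(ys):
        acc = (acc * pow(yi, 1 << i, P)) % P
    return acc == y
```

In the protocol described above the prover would keep the randomness of y independent and prove the linear relation among the committed values in zero knowledge; the sketch only shows why the commitments to the bits determine the commitment to the signed message.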

Furthermore,  it  is  likely  that  signatures  will  contain  mul¬ 
tiple  messages  (e.g.,  a  genetic  disease  name  and  the  outcome 
of  its  testing).  In  those  cases,  multiple  messages  from  a  single 
signature  can  be  used  as  inputs  into  the  garbled  circuit  or,  de¬ 
pending  on  the  function  /,  there  might  be  other  arrangements. 
For instance, one message can be used to provide inputs into
the circuit while another is opened fully or partially. It is not
difficult to generalize equations 2 and 3 to cover such cases.

We now can proceed with the description of the mechanism for verifying
A’s inputs. Recall that for each bit $i$ of her input, A has label
pairs $(\ell_i^0, \ell_i^1)$ and later sends to B the label
$\ell_i^{b_i}$ corresponding to her input bit $b_i$. As before, consider
first the case when A holds a signature on a single bit. To prove that
the label sent to B corresponds to the bit for which she possesses a
signature, we have A commit to the label $\ell_i^{b_i}$ and prove to S
that either the commitment is to $\ell_i^0$ and she has a signature on 0
or the commitment is to $\ell_i^1$ and she has a signature on 1. Let the
commitment be $c_i = \mathrm{Com}(\ell_i^{b_i}, r_i)$. Then if
verification of the ZKPKs for each input bit was successful, S forwards
each $c_i$ to B together with the garbled circuit. Now when A sends her
input label $\ell_i^{b_i}$ to B, she is also required to open the
commitment $c_i$ by sending $r_i$ to B. B will proceed with circuit
evaluation only if $c_i = g^{\hat{\ell}_i} h^{\hat{r}_i}$ for each bit
$i$ of A’s input, where $\hat{\ell}_i$ and $\hat{r}_i$ are the values B
received from A.
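
B’s check on an opened label commitment can be sketched as follows. The group parameters, the integer label values, and the function names are assumptions made for illustration (a toy group with p = 2039, q = 1019, g = 4, h = 9; real labels are long random strings and real groups are much larger).

```python
import secrets

# Toy parameters (assumption): P = 2*Q + 1 with P and Q prime; insecure sizes.
P, Q = 2039, 1019
G, H = 4, 9  # squares mod P, so both lie in the subgroup of order Q


def commit(value, rand):
    """Pedersen-style commitment g^value * h^rand mod P."""
    return (pow(G, value, P) * pow(H, rand, P)) % P


def a_commit_to_label(label_pair, bit):
    """A commits to the label that corresponds to her input bit."""
    r = secrets.randbelow(Q)
    return commit(label_pair[bit], r), r


def b_accepts(c_i, label_hat, r_hat):
    """B evaluates only if the received label and opening match c_i."""
    return c_i == commit(label_hat, r_hat)
```

For example, with the hypothetical label pair (123, 456) and input bit 1, B accepts the opening for 456 and rejects an attempt to pass off 123 under the same randomness.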

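More formally, the statement A proves to S in ZK is

$$PK\{(\sigma, \alpha, \beta, \gamma) : \mathrm{Sig}(\sigma) \wedge y = g^{\sigma}h^{\alpha} \wedge z = g^{\beta}h^{\gamma} \wedge ((y = h^{\alpha} \wedge z/g^{\ell_i^0} = h^{\gamma}) \vee (y/g = h^{\alpha} \wedge z/g^{\ell_i^1} = h^{\gamma}))\}.$$

Similar to the case of B’s input verification, this ZKPK can be
generalized to use a single signature with multiple bits input into
the circuit. More precisely, the statement in equation 2 remains
unchanged, while the second statement becomes:

$$PK\{(\sigma_i, \alpha_i, \beta_i, \gamma_i) : y_i = g^{\sigma_i}h^{\alpha_i} \wedge z_i = g^{\beta_i}h^{\gamma_i} \wedge ((y_i = h^{\alpha_i} \wedge z_i/g^{\ell_i^0} = h^{\gamma_i}) \vee (y_i/g = h^{\alpha_i} \wedge z_i/g^{\ell_i^1} = h^{\gamma_i}))\}. \qquad (4)$$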

We  summarize  the  overall  solution  as  Protocol  3  in  Ap¬ 
pendix  B.  Its  security  can  be  stated  as  follows: 

Theorem 3. Protocol 3 fairly and securely evaluates function
$f$ in the presence of malicious A or B and semi-honest S in the
hybrid model with ideal implementation of OT and where $H$ is
a hash function modeled as a random oracle and inputs of A
and B are verified according to definition 2.

Because  the  structure  of  the  computation  in  Protocol  3  is  the 
same  as  in  Protocol  2  and  primarily  only  ZKPKs  have  been 
added  (that  have  corresponding  simulators  in  the  ideal  model), 
we  defer  the  proof  to  the  full  version  of  this  work. 

Before  we  conclude  this  section,  let  us  comment  on  the 
possibility  of  using  OT  extensions  in  combination  with  certi¬ 
fied  inputs.  First,  notice  that  when  only  a  subset  of  the  input 
bits  is  to  be  verified,  OT  and  OT  extensions  for  the  remaining 
input  bits  can  be  used  as  before.  Second,  if  there  is  a  compu¬ 
tational  benefit  to  using  an  OT  extension  instead  of  individual 
instances  of  OT,  the  benefit  of  an  OT  extension  is  not  going  to 
be  as  pronounced  as  in  the  regular  case  with  no  input  certifica¬ 
tion.  OT  extensions  allow  the  number  of  public  key  operations 
to  be  bounded  by  the  security  parameter  and  be  independent  of 
the  number  of  input  bits,  while  with  certified  inputs  the  num¬ 
ber  of  public  key  operations  is  inevitably  linear  in  the  number 
of input bits being verified. For example, jumping ahead to
experimental results, we see that computation in Protocol 3 in
Table 8 is 1 to 5 orders of magnitude larger for any given party
than in Protocol 2 in Table 7 due to the use of certified inputs.
This  tells  us  that  employing  an  OT  extension  in  combination 
with  certified  inputs  will  have  a  negligible  effect  on  the  perfor¬ 
mance  of  Protocol  3.  Furthermore,  applicability  of  each  indi¬ 
vidual  OT  extension  mechanism  to  the  case  of  certified  inputs 
will  likely  need  to  be  considered  on  a  case  by  case  basis  and 
we  leave  this  as  a  direction  for  future  research.  Publications 
that  adapt  OT  extensions  to  new  models  (such  as  [51])  can  be 
used  as  a  starting  point,  but  do  not  easily  apply  to  our  setting. 

6  Private  Genomic  Computation 

For  all  types  of  genomic  computation  we  assume  that  A  has 
information  extracted  from  her  genome,  which  she  privately 
stores.  Similarly,  B  stores  data  associated  with  his  genome.  A 
and  B  may  enter  some  or  all  of  their  data  into  the  computation 
and  they  may  also  compute  a  function  of  their  individual  data, 
which  will  be  used  as  the  input  into  the  joint  computation. 

Ancestry test. This test would often be invoked when A and B
already know that they are related or have reason to believe that they
are related. Under such circumstances, they are unlikely to try to
cheat  each  other.  For  that  reason,  we  use  the  solution  with 
semi-honest  A  and  B  to  realize  this  test.  (Under  the  circum¬ 
stances  that  this  security  model  is  not  acceptable  for  some 
users  A  and  B,  they  can  always  proceed  with  an  alternative 
solution  from  this  or  other  work,  but  we  use  the  first  protocol.) 
Because  SNP-based  tests  are  most  general  and  can  provide  in¬ 
formation  about  recent  as  well  as  distant  ancestry,  we  build  a 
circuit  that  takes  a  large  number  of  SNPs  from  two  individuals 
and  counts  the  number  of  positions  with  the  same  values.  The 
computed  value  is  then  compared  to  a  number  of  thresholds  to 
determine  the  closest  generation  in  which  the  individuals  have 
the  same  ancestor. 

To  compute  the  number  of  SNPs  which  are  equal  in  the 
DNA  of  two  individuals,  the  circuit  first  proceeds  by  XORing 
two  binary  input  vectors  from  A  and  B  (recall  that  the  value 
of  each  SNP  is  a  bit)  and  then  counts  the  number  of  bits  that 
differ  in  a  hierarchical  manner.  That  is,  in  the  first  round  of 
additions,  every  two  adjacent  bits  are  added  and  the  result  is  a 
2-bit  integer.  In  the  second  round  of  additions,  every  two  ad¬ 
jacent  results  from  the  first  round  are  added  resulting  in  3-bit 
sums. This process continues in $\lceil \log_2 t \rceil$ rounds of
additions, where $t$ is the size of A’s and B’s input, and the last
round performs only a single addition. As mentioned earlier, the result
can  be  interpreted  by  performing  a  number  of  comparisons  at 
the  end,  but  the  cost  of  final  comparisons  is  insignificant  com¬ 
pared  to  the  remaining  size  of  the  circuit. 
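
The counting logic described above can be sketched in plain Python (evaluated in the clear, not as a garbled circuit). The function name and the zero-padding of odd-length levels are assumptions made for illustration.

```python
def count_differences(a_bits, b_bits):
    """XOR two SNP bit vectors, then sum the result with a tree of
    pairwise additions. Returns (number of differing positions, rounds)."""
    level = [x ^ y for x, y in zip(a_bits, b_bits)]
    if not level:
        return 0, 0
    rounds = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so that every element has a partner
        # Round j adds adjacent j-bit values and yields (j+1)-bit sums.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds


a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 1, 1, 0, 0, 0, 0, 0]
differing, rounds = count_differences(a, b)
matching = len(a) - differing  # positions with the same value
```

For the 8-bit example the tree has 3 rounds of additions, in line with the logarithmic round count stated above.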


Paternity  test.  We  assess  that  the  security  setting  with  mali¬ 
cious  users  A  and  B  is  the  most  suitable  for  running  paternity 
tests.  That  is,  the  participants  may  be  inclined  to  tamper  with 
the  computation  to  influence  the  result  of  the  computation.  It 
is,  however,  difficult  to  learn  the  other  party’s  genetic  infor¬ 
mation  by  modifying  one’s  input  into  the  function.  In  partic¬ 
ular,  recall  from  equation  1  that  the  output  of  a  paternity  test 
is  a  single  bit,  which  indicates  whether  the  exact  match  was 
found.  Then  if  a  malicious  participant  engages  in  the  compu¬ 
tation  with  the  same  victim  multiple  times  and  modifies  the 
input  in  the  attempt  to  discover  the  victim’s  genomic  data,  the 
single  bit  output  does  not  help  the  attacker  to  learn  how  his  in¬ 
puts  are  to  be  modified  to  be  closer  to  the  victim’s  input.  The 
situation  is  different  when  the  output  of  the  computation  re¬ 
veals  information  about  the  distance  between  the  inputs  of  A 
and  B,  but  we  do  not  consider  such  computation  in  this  work. 
Thus,  we  do  not  use  input  certification  for  paternity  tests. 

This  test  would  normally  be  run  between  an  individ¬ 
ual  and  a  contested  father  of  that  individual  according  to 
the computation in equation 1. We thus implement the computation
in equation 1 using a Boolean circuit. For each $i$, the circuit XORs
the vectors $(x_{i,1}, x_{i,2}, x_{i,1}, x_{i,2})$ and
$(x'_{i,1}, x'_{i,1}, x'_{i,2}, x'_{i,2})$ and compares each of the four
values in the resulting vector to 0. The (in)equality to 0 testing is
performed using $k - 1$ OR gates, where $k$ is the bitlength of all
$x_{i,j}$’s and $x'_{i,j}$’s. Finally, we compute the AND of the results
of the 4 (in)equality tests, OR the resulting bits across $i$’s, and
output the complement of the computed bit.
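
The Boolean logic of this circuit can be sketched in the clear as follows. The function names and the integer encoding of marker values are assumptions made for illustration; the garbled circuit operates on the individual bits instead.

```python
def marker_has_no_match(pair_a, pair_b):
    """True if none of the four pairings of values at one marker are equal."""
    x1, x2 = pair_a
    y1, y2 = pair_b
    left = (x1, x2, x1, x2)
    right = (y1, y1, y2, y2)
    # XOR is nonzero exactly when two values differ; in the circuit this
    # test is an OR over the k bits of the XOR result (k - 1 OR gates).
    differs = [(l ^ r) != 0 for l, r in zip(left, right)]
    return all(differs)  # AND of the four inequality bits


def paternity_test(markers_a, markers_b):
    """Return 1 iff every marker has at least one value in common."""
    some_marker_fails = any(
        marker_has_no_match(a, b) for a, b in zip(markers_a, markers_b)
    )
    return int(not some_marker_fails)  # complement of the OR across markers
```

For instance, `paternity_test([(5, 7), (3, 3)], [(7, 9), (3, 4)])` finds a common value at both markers, while a single marker with disjoint values forces the output to 0.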

Genetic  compatibility  test.  When  A  and  B  want  to  perform 
a  compatibility  test,  we  assume  that  they  want  to  evaluate  the 
possibility  of  their  children  inheriting  at  least  one  recessive  ge¬ 
netic  disease.  Thus,  we  assume  that  A  and  B  agree  on  a  list  of 
genetic  diseases  to  be  included  in  the  test  (this  list  can  be  stan¬ 
dard,  e.g.,  suggested  by  S  or  a  medical  association).  Because 
performing  a  test  for  a  specific  genetic  disease  is  only  mean¬ 
ingful  if  both  parties  wish  to  be  tested  for  it,  we  assume  that  A 
and  B  can  reconcile  the  differences  in  their  lists. 

To  maximize  privacy,  we  construct  the  function  /  to  be  as 
conservative  as  possible.  In  particular,  given  a  list  of  genetic 
diseases  L,  A  and  B  run  a  compatibility  test  for  each  disease 
$D \in L$, and if at least one test resulted in a positive outcome,
the  function  will  output  1,  and  otherwise  it  will  output  0.  That 
is,  the  function  can  be  interpreted  as  producing  1  if  A  and  B’s 
children  have  a  chance  of  inheriting  the  major  variety  for  at 
least  one  of  the  tested  diseases;  and  producing  0  means  that 
their  children  will  not  inherit  the  major  variety  for  any  of  the 
diseases  in  L.  Evaluating  this  function  can  be  viewed  as  the 
first step in A and B’s interaction. If the output was 1, they may
jointly decide to run more specific computation to determine
the responsible disease or diseases themselves.

The above means that for each $D \in L$, A can locally run
the test to determine whether she is a carrier of $D$. B performs
the same test on his data. Thus, A’s and B’s input into $f$ consists
of $|L|$ bits each and the result is 1 iff $\exists i$ such that A’s and
B’s $i$th input bits are both 1. This computation can be realized
as a simple circuit consisting of $|L|$ AND and $|L| - 1$ OR gates.
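
A minimal sketch of this function, evaluated in the clear, is given below; the function name is hypothetical and each list holds one carrier bit per disease in the agreed list.

```python
def compatibility(a_carrier_bits, b_carrier_bits):
    """Return 1 iff both parties are carriers of at least one common disease."""
    if len(a_carrier_bits) != len(b_carrier_bits):
        raise ValueError("both parties must supply one bit per disease in L")
    result = 0
    for a, b in zip(a_carrier_bits, b_carrier_bits):
        result |= a & b  # one AND per disease, ORed into the running result
    return result
```

With three diseases, inputs `[1, 0, 1]` and `[0, 0, 1]` give 1 because both parties carry the third disease, while `[1, 0, 0]` and `[0, 1, 0]` give 0.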

Next, notice that it is easy for malicious A or B to learn
sensitive information about the other party by using certain inputs.
That is, if a malicious user sets all his input bits to 1,
he will be able to learn whether the other party is a carrier of
at least one disease in $L$. This poses substantial privacy concerns,
particularly  for  matchmaking  services  that  routinely  run  ge¬ 
netic  compatibility  tests  between  many  individuals.  Thus,  we 
require  that  A  and  B  certify  the  results  of  testing  for  each  ge¬ 
netic  disease  on  the  list  (e.g.,  by  a  medical  facility)  and  enter 
certified  inputs  into  the  computation.  (Note  that  the  medical 
facility  that  performs  sequencing  can  also  certify  the  test  re¬ 
sults;  alternatively,  the  medical  facility  performing  test  certifi¬ 
cation  will  require  genome  certification  from  the  facility  that 
performed  sequencing.)  This  means  that  the  server-aided  solu¬ 
tion  with  certified  inputs  will  be  used  for  secure  computation. 

For each disease $D \in L$, the signature will need to include
the name of the disease $D$ and the test outcome $\sigma$, which we
assume is a bit. Then if we target efficient computation, the
disease names will not be input into the circuit, but instead S
will verify that A’s signature used for a particular input wire
includes the same disease name as B’s signature used for the
equivalent input wire. A simple way to achieve this is to reveal
list $L$ to S and reveal the name of the disease included in each
signature (without revealing the signature itself). If we assume
that each issued signature is on the tuple $(D, \sigma)$, i.e., the
signature was produced as $v^{e} = a_1^{D} a_2^{\sigma} b^{s} c$, all
that is needed is for both the sender and the verifier to adjust the
value used in the ZKPK of the signature by $a_1^{D}$ for each
$D \in L$ (we refer the reader to [16] for details). S will need to
check that all conditions appear in the same order among A’s and B’s
inputs (i.e.,
the  sequences  of  diseases  are  identical)  before  proceeding  with 
the  rest  of  the  protocol.  Revealing  the  set  of  diseases  used  in 
the  compatibility  test  would  not  constitute  violation  of  privacy 
if  such  a  set  of  conditions  is  standard  or  suggested  by  S  itself. 

When,  however,  the  parties  compose  a  custom  set  of  ge¬ 
netic  diseases  for  their  genetic  compatibility  testing  and  would 
like to keep the set private, they may be unwilling to reveal the set
of  diseases  to  S.  We  propose  that  the  parties  instead  prove  that 
they  are  providing  results  for  the  same  conditions  without  re¬ 
vealing  the  conditions  themselves  to  the  server.  The  difficulty 
in  doing  so  arises  from  the  fact  that  S  interacts  independently 
with  A  and  B  (possibly  at  non-overlapping  times)  and  A  and  B 
are  not  proving  any  joint  statements  together.  Our  idea  of  prov¬ 
ing  that  inputs  of  A  and  B  correspond  to  the  same  sequence  of 
diseases  consists  of  forming  a  sequence  of  commitments  to  the 


diseases  in  L,  the  openings  of  which  are  known  to  both  A  and 
B. That is, A and B jointly generate a commitment to each disease
using shared randomness and use those commitments at
the  time  of  proving  that  their  inputs  have  been  certified.  Then 
if  A  supplies  commitments  Comi,  . . . ,  Comt  and  proves  that 
the  committed  values  correspond  to  the  diseases  in  her  signa¬ 
tures,  S  will  check  that  B  supplies  the  same  sequence  of  com¬ 
mitments  and  also  proves  that  the  committed  values  are  equal 
to  the  diseases  in  the  signatures  he  possesses.  This  will  ensure 
that  A  and  B  supply  input  bits  for  the  same  sequence  of  dis¬ 
eases.  To  jointly  produce  the  commitments,  we  have  both  A 
and  B  contribute  their  own  randomness  and  the  resulting  com¬ 
mitment  will  be  a  function  of  A’s  and  B’s  contribution.  It  can 
proceed  as  follows  (recall  that  A  and  B  can  be  malicious): 

1. A chooses a random $r_A$ and sends to B $c_A = \mathrm{Com}(r_A, z)$.

2. B chooses a random $r_B$ and sends to A $c_B = \mathrm{Com}(r_B, z')$.

3. A and B open their commitments by exchanging $(r_A, z)$
and $(r_B, z')$ and verify that they match $c_A$ and $c_B$, resp.

4. They form joint randomness as $r = r_A \oplus r_B$ and use it to
construct commitment $\mathrm{Com}(D, r)$.
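
The four steps above can be sketched as a single-process simulation. The sketch assumes a hash-based commitment (SHA-256 over the opening value followed by the message), which is swapped in here for illustration and is not necessarily the commitment scheme Com used in the protocol; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(value: bytes, opening: bytes) -> bytes:
    """Hash-based stand-in for Com(value, opening); opening has fixed length."""
    return hashlib.sha256(opening + value).digest()


def verify(commitment: bytes, value: bytes, opening: bytes) -> bool:
    """Check that (value, opening) opens the given commitment."""
    return hmac.compare_digest(commitment, commit(value, opening))


def joint_randomness(nbytes: int = 32) -> bytes:
    """Simulate both parties locally and return r = r_A XOR r_B."""
    # Step 1: A picks r_A and sends c_A to B.
    r_a, z_a = secrets.token_bytes(nbytes), secrets.token_bytes(32)
    c_a = commit(r_a, z_a)
    # Step 2: B picks r_B and sends c_B to A.
    r_b, z_b = secrets.token_bytes(nbytes), secrets.token_bytes(32)
    c_b = commit(r_b, z_b)
    # Step 3: both open their commitments and each checks the other's opening.
    if not (verify(c_a, r_a, z_a) and verify(c_b, r_b, z_b)):
        raise ValueError("commitment opening failed")
    # Step 4: neither party alone controls the XOR of the two contributions.
    return bytes(x ^ y for x, y in zip(r_a, r_b))
```

Committing before either value is revealed is what prevents a malicious party from choosing its contribution as a function of the other party’s value.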

Then the (high-level) statement that A and B prove about their
inputs is $PK\{(\alpha, \sigma) : \mathrm{Sig}(\alpha, \sigma) \wedge y_1 = \mathrm{Com}(\alpha) \wedge y_2 = \mathrm{Com}(\sigma)\}$
using $y_1$ shared between A and B, while the remaining
portion is specific to A and B as detailed in section 5.3.

7  Performance  Evaluation 

In  this  section,  we  report  on  the  results  of  our  implementa¬ 
tion.  The  implementation  was  written  in  C/C++  using  Miracl 
library  [5]  for  large  number  arithmetic  and  JustGarble  library 
[4]  for  garbled  circuit  implementation.  We  provide  experimen¬ 
tal  results  for  ancestry,  paternity,  and  compatibility  tests  im¬ 
plemented  as  described  in  section  6  as  well  as  additional  func¬ 
tions,  on  all  of  which  we  further  elaborate  below.  The  security 
parameters  for  symmetric  key  cryptography  and  statistical  se¬ 
curity  were  set  to  128.  The  security  parameter  for  public  key 
cryptography  (for  both  RSA  modulus  and  discrete  logarithm 
setting)  was  set  to  1536.  Additionally,  the  security  parame¬ 
ter  for  the  group  size  in  the  discrete  logarithm  setting  was  set 
to 192. All tests (for A, B, and S) were run on a single core of a
quad-core 3.2GHz machine with an Intel i5-3470 processor running Red
Hat Linux 2.6.32. Note that in practice S is
expected  to  have  more  powerful  hardware  and  the  runtimes 
can  be  significantly  reduced  by  utilizing  more  cores.  All  ex¬ 
periments  were  run  5  times,  and  the  mean  value  is  reported. 

To provide additional insights into which protocol component is
the main performance bottleneck, we separately report computation
times for different parts of each solution (e.g., garbled circuit
evaluation, OT, etc.). Furthermore, we separately list the times for
offline and online computation, where, as in other publications,
offline computation refers to all operations that can be performed
before the inputs become available. Lastly, because the speed of
communication channels can vary greatly, we separately report the size
of communication for each party, and communication time is not included
in the runtimes. In several cases overlapping computation with
communication is possible (e.g., S can perform OT computation and
simultaneously transmit the garbled circuit), so the overall runtime
does not need to be the sum of computation and communication time. We
first discuss ancestry, paternity, and compatibility tests in their
respective settings and then proceed with evaluating additional
functions in all three settings.

Table 2. Performance of ancestry test without/with half-gates.

| Party | GC garble (offline) | GC eval (online) | Comm. sent | Comm. received |
|-------|---------------------|------------------|------------|----------------|
| A     | 1.8ms               | -                | 2MB        | 0MB            |
| B     | 19.8/18.4ms         | -                | 8/6MB      | 0MB            |
| S     | -                   | 12.5/15.9ms      | 0MB        | 10/8MB         |

7.1  Ancestry  test 

Recall  that  the  ancestry  test  is  implemented  in  the  setting 
where  A  and  B  are  semi-honest,  but  S  can  be  malicious.  We 
ran this test using $2^{17}$ SNPs as the input for A and B. The
resulting circuit used 655,304 XOR gates and 131,072 non-XOR gates.
The computation time and communication size are given in Table 2. We
used the original JustGarble implementation and also implemented a
variant with the recent half-gates optimization [63], which reduces
bandwidth associated with transmitting garbled circuits.¹ Both variants
are listed in Table 2. In the context of this work, the half-gates
optimization
has  the  largest  impact  on  the  performance  of  the  first  proto¬ 
col,  as  in  the  remaining  protocols  other  components  of  SFE 
are  likely  to  dominate  the  overall  time. 

The implementation assumes that A only creates labels for
her input wires and communicates $2^{17}$ labels to B. B performs
the rest of the garbling work and interacts with S. As expected,
the time for circuit garbling and evaluation is small, but the
size of communication is fairly large because of the large input
size and, consequently, circuit size. Nevertheless, we consider
the runtimes very small for a computation of this size.


1 In both the original and half-gates implementations, garbling a
non-free gate involves calling AES on 4 blocks, while evaluating a
non-free gate calls AES on 2 blocks in the half-gates implementation
and on 1 block in the original implementation. Any deviations in the
run time from these expectations are due to non-cryptographic
operations.


To  provide  insights  into  performance  gains  of  our  solution 
compared  to  the  regular  two-party  computation  in  the  semi- 
honest  setting,  we  additionally  implement  the  garbled  circuit- 
based  approach  in  the  presence  of  semi-honest  A  and  B  only. 
In  addition  to  circuit  garbling  and  evaluation,  this  also  requires 
the  use  of  OT,  which  we  implement  using  a  recent  optimized 
OT  extension  construction  from  [7]  (including  optimizations 
specific  to  Yao’s  garbled  circuit  evaluation).  As  in  [7],  we  use 
Naor-Pinkas  OT  for  128  base  OTs  [61].  The  results  are  given 
in  Table  3.  Compared  to  the  server-aided  setting,  computation 
is  higher  by  at  least  two  orders  of  magnitude  for  each  party 
and  communication  is  noticeably  increased  as  well. 

7.2  Paternity  test 

Next,  we  look  at  the  paternity  test,  implemented  as  described 
in  section  6  in  the  presence  of  malicious  A  and  B  and  semi- 
honest  S.  The  inputs  for  both  A  and  B  consisted  of  13  2- 
element  sets,  where  each  element  is  9  bits  long.  We  use  OT 
extension  from  [7]  with  128  Naor-Pinkas  base  OTs.  The  circuit 
consisted  of  468  XOR  and  467  non-XOR  gates.  The  results  of 
this  experiment  are  reported  in  Table  4.  The  computation  for 
output  verification  is  reported  only  as  part  of  total  time.  Not 
surprisingly,  the  cost  of  OT  dominates  the  overall  runtime,  but 
for  A  the  overhead  is  negligible  (the  cost  of  generating  input 
labels  and  verifying  the  output  labels  returned  by  B).  Thus,  it 
is  well-suited  for  settings  when  one  user  is  very  constrained. 
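
As a rough consistency check on the reported gate counts, the tally below assumes 13 markers with four comparisons each on $k = 9$ bit values, and further assumes (beyond what the text states) that the four comparison bits of a marker are combined with 3 AND gates and that the 13 per-marker bits are combined with 12 OR gates:

$$\text{XOR gates: } 13 \cdot 4 \cdot 9 = 468, \qquad \text{non-XOR gates: } 13 \cdot \bigl(4 \cdot (9 - 1) + 3\bigr) + (13 - 1) = 455 + 12 = 467.$$

Under these assumptions the tally agrees with the 468 XOR and 467 non-XOR gates reported for the circuit.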

Compared  to  two-party  computation  in  the  presence  of 
malicious  participants,  our  solution  reduces  both  computation 
and  communication  for  the  participants  by  at  least  two  orders 
of  magnitude.  This  is  because  practical  constructions  rely  on 
cut-and-choose  (and  other)  techniques  to  ensure  that  the  party 
who garbles the circuit is unable to learn unauthorized information
about the other participant’s input. Recent results such
as [6, 56, 58] require the circuit generator to garble on the order
of 125 circuits for cheating probability of at most $2^{-40}$,
some  of  which  are  checked  (i.e.,  re-generated)  by  the  circuit 
evaluator,  while  the  remaining  circuits  are  evaluated.  Thus,  the 
work  of  each  of  A  and  B  will  have  to  increase  by  at  least  two 
orders  of  magnitude  just  for  circuit  garbling  and  evaluation, 
not  counting  other  techniques  that  deter  a  number  of  known 
attacks  and  result  in  increasing  the  input  size  and  introducing 
expensive  public  key  operations.  A  notable  exception  to  the 
above  is  the  work  of  Lindell  [54]  that  reduces  the  number  of 
circuits  to  40  for  the  same  cheating  probability.  The  construc¬ 
tion,  however,  results  in  savings  only  for  circuits  of  large  size 
as it introduces a large number of additional public key operations.
Thus, for paternity tests, using constructions with a larger
number of circuits is very likely to be faster in practice, which
results in a drastic difference between our solution and a regular
two-party protocol with malicious participants. This difference
in performance can be explained by the fact that in our setting
one party is known not to deviate from the protocol, allowing
for a more efficient solution. We also provide a comparison to
recent general three-party constructions in section 7.4.

Table 3. Performance of ancestry test without server without/with half-gates.

| Party | GC garble   | GC eval     | OT offline | OT online | Total offline | Total online  | Comm. sent   | Comm. received |
|-------|-------------|-------------|------------|-----------|---------------|---------------|--------------|----------------|
| A     | 21.6/20.2ms | -           | 195.2ms    | 1983ms    | 216.8/215.4ms | 1983ms        | 10.02/8.02MB | 2.03MB         |
| B     | -           | 12.5/15.9ms | 2003ms     | 218.6ms   | 2003ms        | 231.1/234.5ms | 2.03MB       | 10.02/8.02MB   |

Table 4. Performance of paternity test (no half-gates); work is in
ms, communication is in KB.

| Party | GC garble | GC eval | OT offline | OT online | Total offline | Total online | Comm. sent | Comm. recvd |
|-------|-----------|---------|------------|-----------|---------------|--------------|------------|-------------|
| A     | 0.003     | -       | -          | -         | 0.003         | -            | 3.7        | 0.06        |
| B     | -         | 0.01    | 515.5      | 201.7     | 515.5         | 201.7        | 31.67      | 56.88       |
| S     | 0.03      | -       | 196.1      | 260.9     | 196.1         | 260.9        | 53.32      | 31.66       |

Baldi  et  al.  [12]  also  provide  a  private  paternity  test  in 
the  two-party  setting  (between  a  client  and  a  server).  It  uses  a 
different  computation  based  on  Restriction  Fragment  Length 
Polymorphisms  (RFLPs)  and  relies  on  private  set  intersec¬ 
tion  as  a  cryptographic  building  block.  Both  offline  and  online 
times  for  the  client  and  the  server  are  3.4  ms  and  the  communi¬ 
cation  size  is  3KB  for  the  client  and  3.5KB  for  the  server  when 
the  test  is  performed  with  25  markers.  All  times  and  communi¬ 
cation  sizes  double  when  the  test  is  run  with  50  markers.  While 
the  runtimes  we  report  are  higher,  the  implementation  of  [12] 
did  not  consider  malicious  participants.  If  protection  against 
malicious  A  and  B  in  our  solution  is  removed,  the  work  for  all 
parties reduces to well below 0.1 millisecond and communication
becomes a couple of KBs.

7.3  Genetic  compatibility  test 

Lastly, the genetic compatibility test is run in the setting where
A  and  B  are  malicious  and  their  inputs  must  be  certified.  We 
choose  the  variant  of  the  solution  that  reveals  the  list  of  dis¬ 
eases  L  to  the  server  (i.e.,  a  standard  list  is  used).  We  im¬ 
plement  the  signature  scheme,  OT,  and  ZKPKs  as  described 
earlier.  All  ZKPKs  are  non-interactive  using  the  Fiat-Shamir 
heuristic  [33].  We  used  \L\  =  10  and  thus  A  and  B  provide  10 
input  bits  into  the  circuit  accompanied  by  10  signatures.  The 
circuit  consisted  of  only  19  non-XOR  gates.  The  performance 
of  the  test  is  given  in  Table  5.  We  divide  all  ZKPKs  into  a  proof 
of  signature  possession  (together  with  a  commitment),  denoted 
by  “Sign  PK”  in  the  table,  and  the  remaining  ZK  proofs,  de¬ 


noted  by  “Other  PK.”  As  it  is  clear  from  the  table,  input  cer¬ 
tification  contributes  most  of  the  solution’s  overhead,  but  it  is 
still  on  the  order  of  1-3  seconds  for  all  parties. 
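
As a plaintext reference for what such a circuit could compute, the sketch below assumes (this is an illustrative reading, not something stated in this section) that the test flags a pair of users when both carry the same disease from the list $L$, i.e., it takes the OR over all positions of the AND of the two input bits. Under that assumption a naive circuit needs $|L|$ AND gates and $|L| - 1$ OR gates, which for $|L| = 10$ is consistent with the 19 non-XOR gates reported above. The function names are hypothetical.

```python
def compatibility_flag(a_bits, b_bits):
    """Plaintext reference: 1 if both users carry some common disease, else 0.

    Assumption: bit i is 1 when the user carries disease i of the list L.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("both users must supply one bit per disease in L")
    return int(any(a & b for a, b in zip(a_bits, b_bits)))


def non_xor_gate_count(list_size):
    """Gate count of the naive circuit: one AND per disease, then an OR chain."""
    return list_size + (list_size - 1)
```

For example, `compatibility_flag([0, 1, 0], [1, 1, 0])` flags the pair because both users carry the second disease, while `non_xor_gate_count(10)` gives the gate count for a ten-disease list.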

As mentioned earlier, we are not aware of general results
that achieve input certification for comparison. However,
the comparison to general two-party computation in the presence
of malicious parties or server-aided two-party computation
from sections 7.2 and 7.4 applies here as well.

Baldi et al. [12] also build a solution and report on the
performance of a genetic compatibility test. In [12], testing for
the presence in the server's genome of a genetic disease that the
client carries consists of the client providing the disease fingerprint
in the form of (nucleotide, location) pairs (which is equivalent
to a SNP) and both parties searching whether the disease
fingerprint also appears in the server's DNA. This requires
scanning over the entire genome, which our solution avoids.
As a result, the solution of [12] incurs substantial offline overhead
for the server (67 minutes) and large communication size
(around 4GB) even for semi-honest participants. The solution
utilizes authorized private set intersection, which allows inputs
of one party (as opposed to both in our work) to be verified.
Compared to [12], in our framework, testing for a single disease
requires a fraction of a second for each party with malicious
A and B, where inputs of both of them are certified. The
computation is greatly simplified because the list of diseases is
assumed to be known by both users. When this is the case, the
cost of input certification greatly dominates the overall time.

7.4  Additional  functions 

To better understand the performance of our solutions for a variety
of functions, we next present the results of evaluating a number
of functionalities in all three settings put forward in this work.
We evaluate AES (standard test), Hamming distance (addition-heavy),
matrix multiplication (multiplication-heavy), and edit
distance (comparison-heavy) as representative functions used
in related literature. Tables 6, 7, and 8 provide performance results
for protocols 1, 2, and 3, respectively (no half-gates), and
Table 10 in Appendix D reports on the performance of protocol
1 with half-gates (recall that this optimization has the most impact
on the first protocol). The inputs are either $n$-bit strings,
$n \times n$ matrices of 32-bit integers, or n-bit strings of 8-bit
characters for the choice of $n$ listed in the tables.
Table 5. Performance of compatibility test (no half-gates).

| Party | Garbled circuit: garble | Garbled circuit: eval | OT: offline | OT: online | Sign PK: offline | Sign PK: online | Other PK: offline | Other PK: online | Total time: offline | Total time: online | Comm: sent | Comm: received |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 0ms | - | - | - | 1170ms | 42.1ms | 616.8ms | 20.6ms | 1790ms | 62.7ms | 34.35KB | 0.06KB |
| B | - | 0.001ms | 15.4ms | 14.6ms | 1170ms | 42.1ms | 282.4ms | 15.7ms | 1470ms | 72.4ms | 36.41KB | 2.98KB |
| S | 0.003ms | - | 29.3ms | 15.2ms | 0 | 2060ms | 0ms | 756ms | 29.3ms | 2830ms | 2.87KB | 70.59KB |

All of [23, 24, 38, 47, 60] provide server-aided secure
computation schemes for general functionalities that could be
compared to our constructions. The solution of [38] has the
closest setting to our work (protocol 2) that assumes two malicious
users and a semi-honest server. While no implementation
was provided in [38], the construction of [38] would require A
and B to verify on the order of $O(Kn)$ signatures and engage
in $O(Kn)$ OTs (where $n$ is the number of input bits), which
translates into tens of thousands of public key operations even
for simple functions and a significantly larger volume of communication
than in protocol 2. The server's work in [38] is
larger than in our solution as well. The way the server is used,
however, is more constrained than in our work.

Whitewash [23] improves on the result of Carter et al. [24]
and thus we include performance comparison only for the former.
In addition, Mood et al. [60] improves on the result of
[24] by allowing a garbled value from one circuit to be transformed
to a garbled input of another circuit, thus allowing for
computation to be performed in stages and reusing values from
one circuit in a different circuit. This allows for savings associated
with input transfer and validation. However, for the functions
we report in this section (such as matrix multiplication
and edit distance), [60] did not show observable improvement
in runtime per circuit when multiple circuits are executed instead
of a single circuit. Thus, we do not provide a direct comparison
of our results with the performance of [60]. The techniques
of [60] appear to be most effective for circuits with a
high ratio of input bits to the circuit size.

Because of the limitations of the underlying PCF compiler
[53] on which Whitewash builds, we were able to run Whitewash
only on the circuits included with the tool. In particular,
we were unable to run AES and edit distance experiments, as
well as matrix multiplication for $4 \times 4$ matrices. The details
of Whitewash performance on the same setup as in our other
experiments are provided in Table 9. The security setting of
Whitewash is the closest to our protocol 2. If we then compare
the overhead in Tables 7 and 9, we see that computation time is
at least 3 orders of magnitude less in protocol 2 for all parties
(phone in Whitewash corresponds to our party A) and communication
is 2 to 3 orders of magnitude less in protocol 2.
Our savings are possible because the security model of [60] is
more challenging (where any participating party can act maliciously).
Furthermore, the goal of [60] was not to optimize the
overall work, but rather to lower the overhead of the weak party.

Kamara et al. [47] use server-aided computation with any
number of participants in a somewhat different security model.
The authors of [47] were unable to share their implementation
with us and thus we compare performance with the numbers
reported in [47]. In the 2-party plus server setting, [47] reports
simplified AES performance (without key expansion) on the
order of 40 seconds and hundreds of MBs in communication,
while we obtain about 1 second total time and communication
less than 0.5MB. For 50-character edit distance, [47] reports
240 seconds runtime with over 1GB communication, while we
achieve on the order of 1 second runtime with 24MB communication
for 64-character strings. Once again, the performance
gap can be justified by the differences in the security model.

8  Conclusions 

This work is motivated by the need to protect sensitive genomic
data when it is used in computation, especially in voluntary
non-health-related computation. Because computation
over one's genome often happens in server-facilitated settings,
we study server-aided secure two-party computation in a number
of security settings. One such security setting assumes
that users A and B may act arbitrarily and, in addition to requiring
security in the presence of malicious users, we also enforce
that A and B enter their true inputs based on third party
certification. We are not aware of any prior work that combines
input certification with general secure multi-party computation
based on Yao's garbled circuits. We develop general solutions
in our server-aided framework. Despite their generality, they
lead to efficient implementations of genetic tests. In particular,
we design and implement genetic paternity, compatibility, and
common ancestry tests, all of which run in a matter of seconds
or less and compare favorably with the state of the art.
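Table 6. Performance of protocol 1 (no half-gates); work is in ms.

| Function | Input size | Party | Computation: offl. | Computation: onl. | Computation: total | Communication: sent | Communication: recvd | Communication: total |
|---|---|---|---|---|---|---|---|---|
| AES | 128 | A | 0.02 | - | 0.02 | 2KB | 0.01KB | 2KB |
| | | B | 0.46 | - | 0.46 | 272KB | 0.11KB | 272KB |
| | | S | - | 0.15 | 0.15 | 0.11KB | 274KB | 274KB |
| Hamming distance (bits) | 2^12 | A | 0.06 | - | 0.06 | 64KB | 0 | 64KB |
| | | B | 0.34 | - | 0.34 | 256KB | 0.2KB | 256KB |
| | | S | - | 0.17 | 0.17 | 0.19KB | 320KB | 320KB |
| | 2^13 | A | 0.11 | - | 0.11 | 128KB | 0 | 128KB |
| | | B | 0.73 | - | 0.73 | 507KB | 0.2KB | 507KB |
| | | S | - | 0.38 | 0.38 | 0.2KB | 635KB | 636KB |
| | 2^14 | A | 0.22 | - | 0.22 | 256KB | 0 | 256KB |
| | | B | 1.83 | - | 1.83 | 1.0MB | 0KB | 1.0MB |
| | | S | - | 1.1 | 1.1 | 0.4KB | 1.25MB | 1.25MB |
| Matrix multiplication (n x n ints) | 4 | A | 0.01 | - | 0.01 | 8KB | 0.06KB | 8KB |
| | | B | 14.5 | - | 14.5 | 9.0MB | 8KB | 9.0MB |
| | | S | - | 7.49 | 7.49 | 8KB | 9.0MB | 9.0MB |
| | 8 | A | 0.03 | - | 0.03 | 32KB | 0.25KB | 32KB |
| | | B | 116 | - | 116 | 72MB | 32KB | 72MB |
| | | S | - | 59.7 | 59.7 | 32KB | 72MB | 72MB |
| | 16 | A | 0.11 | - | 0.11 | 128KB | 1KB | 129KB |
| | | B | 926 | - | 926 | 576MB | 128KB | 576MB |
| | | S | - | 476 | 476 | 128KB | 576MB | 576MB |
| Edit distance (chars) | 32 | A | 0.00 | - | 0.00 | 4KB | 0 | 4KB |
| | | B | 9.52 | - | 9.52 | 61MB | 0.1KB | 61MB |
| | | S | - | 3.33 | 3.33 | 0.08KB | 61MB | 61MB |
| | 64 | A | 0.01 | - | 0.01 | 8KB | 0 | 8KB |
| | | B | 38.1 | - | 38.1 | 24MB | 0.9KB | 24MB |
| | | S | - | 13.3 | 13.3 | 0.9KB | 24MB | 24MB |
| | 128 | A | 0.01 | - | 0.01 | 16KB | 0 | 16KB |
| | | B | 153 | - | 153 | 97.5MB | 0.11KB | 97.5MB |
| | | S | - | 53.5 | 53.5 | 0.11KB | 97.4MB | 97.4MB |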
Table 8. Performance of protocol 3 (no half-gates); work is in sec.

A  Additional  Background

Genomic background. Genomes represent complete hereditary
information of an individual. Information extracted from
one's genome can take different forms. One type is called Single
Nucleotide Polymorphisms (SNPs), each of which corresponds
to a well known variation in a single nucleotide (a nucleotide
can be viewed as a simple unit represented by a letter
A, C, G, or T). Because SNP mutations are often associated
with how one develops diseases and responds to treatments,
they are commonly used in genetic disease and disorder testing.
The same set of SNPs (i.e., nucleotides in the same positions)
would be extracted for each individual, but the values
associated with each SNP differ from one individual to another.
Normally each SNP is referenced by a specific index
and its value in an individual is represented as a bit, while
representations consisting of 3 values 0, 1, 2 are used as well.
Input: Sender S has two strings $m_0$ and $m_1$, receiver R has a bit $\sigma$.
Common input consists of prime $p$, generator $g$ of subgroup of $\mathbb{Z}_p^*$ of
prime order $q$, and a random element $C$ from the group generated by $g$
(chosen by S).

Output: R learns $m_\sigma$ and S learns nothing.

OT Protocol:

1. S chooses random $r \in \mathbb{Z}_q$ and computes $C^r$ and $g^r$.

2. R chooses $k \in \mathbb{Z}_q^*$, sets public keys $PK_\sigma = g^k$ and $PK_{1-\sigma} = C/PK_\sigma$,
and sends $PK_0$ to S.

3. After receiving $PK_0$, S computes $(PK_0)^r$ and
$(PK_1)^r = C^r/(PK_0)^r$. S sends to R $g^r$ and two encryptions
$H((PK_0)^r, 0) \oplus m_0$ and $H((PK_1)^r, 1) \oplus m_1$,
where $H$ is a hash function (modeled as a random oracle).

4. R computes $H((g^r)^k) = H((PK_\sigma)^r)$ and uses it to recover
$m_\sigma$.

Fig. 1. 1-out-of-2 Oblivious Transfer of [61].
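
The message flow of Figure 1 can be traced in a few lines of code. The following is a minimal sketch only, run by a single process that plays both roles: the group parameters ($p = 2039$, $q = 1019$, $g = 4$) are toy values chosen here for illustration and offer no security, SHAKE-256 stands in for the random oracle $H$, and the helper names are hypothetical rather than taken from our implementation.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): p = 2q + 1, g generates the
# subgroup of Z_p^* of prime order q.
P, Q, G = 2039, 1019, 4


def H(elem, index, nbytes):
    """Stand-in for the random oracle: hash a group element and an index."""
    return hashlib.shake_256(f"{elem}|{index}".encode()).digest(nbytes)


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def oblivious_transfer(m0, m1, sigma):
    """Run all four steps of Figure 1 locally and return R's output."""
    assert len(m0) == len(m1) and sigma in (0, 1)
    n = len(m0)
    # Common input: random element C of the subgroup, chosen by S.
    C = pow(G, secrets.randbelow(Q - 1) + 1, P)
    # Step 1 (S): random r, compute C^r and g^r.
    r = secrets.randbelow(Q)
    C_r, g_r = pow(C, r, P), pow(G, r, P)
    # Step 2 (R): PK_sigma = g^k, PK_{1-sigma} = C / PK_sigma; send PK_0.
    k = secrets.randbelow(Q - 1) + 1
    pk_sigma = pow(G, k, P)
    pk_other = C * pow(pk_sigma, -1, P) % P
    pk0 = pk_sigma if sigma == 0 else pk_other
    # Step 3 (S): derive both keys from PK_0 and encrypt both messages.
    pk0_r = pow(pk0, r, P)
    pk1_r = C_r * pow(pk0_r, -1, P) % P
    e0 = xor(H(pk0_r, 0, n), m0)
    e1 = xor(H(pk1_r, 1, n), m1)
    # Step 4 (R): (g^r)^k equals (PK_sigma)^r, so R can open only e_sigma.
    key = H(pow(g_r, k, P), sigma, n)
    return xor(key, (e0, e1)[sigma])
```

Calling `oblivious_transfer(b"left", b"rght", 1)` returns the second message; R never holds a discrete logarithm for $PK_{1-\sigma}$, which is what prevents it from opening the other ciphertext. The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.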

Another type of data extracted from a genome is based on
Short Tandem Repeats (STRs). STRs occur when a short region
consisting of two or more nucleotides is repeated and the
occurrences are adjacent to each other. Unrelated individuals
are likely to have a different number of repeats of a given STR
sequence in certain regions in their DNA and thus STRs are often
used for identity testing or testing between close relatives
(such as paternity testing).

Garbled circuit evaluation. The basic idea behind garbled
circuit evaluation is as follows (here we present only an
overview of the approach and refer the reader to, e.g., [55] for
technical details and security analysis): For each wire $i$ of the
Boolean circuit corresponding to $f$, the circuit generator creates
a pair of randomly chosen labels $\ell_i^0$ and $\ell_i^1$ (of sufficient
length that depends on the security parameter) which map to
the values of 0 and 1, respectively, of this wire. Let $g$ be a binary
gate that takes two input bits and produces a single bit;
also let the input wires to $g$ have indices $i$ and $j$ and let the
output wire have index $k$. Then to create a garbled representation
of the gate, the circuit generator produces a truth table
containing four entries of the form
$\mathrm{Enc}_{\ell_i^{b_i}, \ell_j^{b_j}}(\ell_k^{g(b_i, b_j)})$. Here
$b_i, b_j \in \{0, 1\}$ are input bits into the gate and all entries in the
table are randomly permuted. Possession of two input labels
$\ell_i^{b_i}$ and $\ell_j^{b_j}$ for any given values of $b_i$ and $b_j$ will allow for
recovery of the corresponding output label $\ell_k^{g(b_i, b_j)}$ without
revealing anything else. Then upon garbling all gates of the
circuit, the circuit generator communicates all garbled gates,
to which we collectively refer as a garbled circuit $G_f$, to the
circuit evaluator together with a single label for each input
wire $i$ according to the input bit $b_i$. The labels corresponding to
the input wires of the circuit generator are simply transmitted
to the evaluator, while the labels corresponding to the inputs
of the circuit evaluator are communicated to the evaluator by
the means of OT (see section 3.2). The knowledge of the input
labels and garbled gates allows the circuit evaluator to evaluate
the entire circuit in its garbled representation and obtain a
label for each output wire representing the output. Then either
the circuit generator sends the label pairs (in order) for all output
wires to the circuit evaluator, which allows the evaluator
to interpret the meaning of the labels and learn the output, or
the evaluator sends computed labels to the circuit generator,
which in turn allows the circuit generator to learn the result.
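
To make the four-entry table concrete, the following is a minimal sketch of garbling and evaluating a single gate. It is a textbook-style illustration under stated assumptions, not the optimized construction used in our experiments: encryption is a SHA-256-derived one-time pad, and each plaintext carries a block of zero bytes so that the evaluator can recognize the one row its labels open. All names are hypothetical.

```python
import hashlib
import secrets

KAPPA = 16  # label length in bytes (assumption)


def _pad(label_i, label_j):
    """32-byte pad derived from both input labels (stand-in for Enc)."""
    return hashlib.sha256(label_i + label_j).digest()


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def new_wire():
    """A wire is a pair of random labels: index 0 encodes bit 0, index 1 bit 1."""
    return (secrets.token_bytes(KAPPA), secrets.token_bytes(KAPPA))


def garble_gate(gate, wire_i, wire_j, wire_k):
    """Encrypt the output label for every (b_i, b_j), then permute the rows."""
    table = []
    for b_i in (0, 1):
        for b_j in (0, 1):
            # The zero tag lets the evaluator detect a correct decryption.
            plaintext = wire_k[gate(b_i, b_j)] + bytes(KAPPA)
            table.append(_xor(_pad(wire_i[b_i], wire_j[b_j]), plaintext))
    secrets.SystemRandom().shuffle(table)
    return table


def eval_gate(table, label_i, label_j):
    """Recover the single output label that the two held labels unlock."""
    pad = _pad(label_i, label_j)
    for row in table:
        plaintext = _xor(pad, row)
        if plaintext[KAPPA:] == bytes(KAPPA):
            return plaintext[:KAPPA]
    raise ValueError("labels do not open any row of the garbled table")
```

An evaluator holding `wire_i[1]` and `wire_j[0]` for an AND gate obtains `wire_k[0]` and learns nothing about which bit that label encodes. Trial decryption of all four rows is used here for simplicity; practical implementations avoid it with permutation bits.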

Naor-Pinkas OT. For completeness of this work, we provide
the Naor-Pinkas OT protocol [61] in Figure 1.


B  Additional  Details 

Below  we  summarize  the  overall  solution  with  certified  inputs 
in  the  presence  of  malicious  A  and  B  and  semi-honest  S  as 
Protocol  3.  For  simplicity  of  presentation,  we  assume  that  all 
input  bits  of  A  and  B  are  certified  and  signed  in  one  message. 

C  Security  Proofs 

Proof of Theorem 2. We start by showing fairness and then
proceed with security. The only way for A or B to learn any
output is when A is satisfied with the verification of the output
labels she received from B. Recall that each received label $\ell_i$
is checked against $H(\ell_i^b)$ for some bit $b$, where $H$ is
a random oracle. The probability that this check succeeds for
some $\ell_i$ that is not equal to $\ell_i^0$ or $\ell_i^1$ is negligible. Thus, A is
guaranteed to possess the result of garbled circuit evaluation,
at which point both parties have access to the output.
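
The label check used in the fairness argument can be sketched as follows. This is an illustrative sketch under assumptions: SHA-256 stands in for the random oracle $H$, and the function names are hypothetical. A label is accepted only if its hash matches one of the two hashed labels that A holds for that output wire.

```python
import hashlib
import hmac


def H(label):
    """Stand-in for the random oracle H."""
    return hashlib.sha256(label).digest()


def verify_output_label(label, hashed_pair):
    """Accept a label iff its hash equals one entry of the hashed pair."""
    digest = H(label)
    return any(hmac.compare_digest(digest, h) for h in hashed_pair)
```

Because the pair of hashes may be held in permuted order, a successful check tells A that the label is genuine without telling her which bit it encodes.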

We next construct simulators for all of the (independent)
adversaries $\mathcal{A}_A$, $\mathcal{A}_B$, and $\mathcal{A}_S$. We start with a simulator $S_A$ for
malicious $\mathcal{A}_A$. $S_A$ runs $\mathcal{A}_A$ and simulates the remaining parties.
$\mathcal{A}_A$ produces $t_1$ random labels and sends them to $S_A$,
while $S_A$ chooses $\Delta$ and sends it to $\mathcal{A}_A$. If at least one label is
of an incorrect bitlength, $S_A$ aborts. If $S_A$ did not abort, $\mathcal{A}_A$
sends $t_1$ labels to $S_A$. If the $i$th label sent by $\mathcal{A}_A$ does not correspond
to one of the labels in the $i$th pair of labels $(\ell_i^0, \ell_i^0 \oplus \Delta)$
corresponding to $\mathcal{A}_A$'s inputs, $S_A$ aborts. If $S_A$ did not abort,
it interprets the meaning of the input labels received from $\mathcal{A}_A$
and stores the input as $x'_1$. At some point $S_A$ creates a random
label $\ell_i$ for each bit of the output and sends them to $\mathcal{A}_A$.
Upon $\mathcal{A}_A$'s request, $S_A$ also chooses another random label $\ell'_i$
for each bit of the output. For each bit $i$ of the output, $S_A$ sends
to $\mathcal{A}_A$ the pair $H(\ell_i)$, $H(\ell'_i)$ in a randomly permuted order. If
$\mathcal{A}_A$ notifies $S_A$ of successful verification of the output labels,
$S_A$ queries the TP for the output $f(x'_1, x_2)$. For each $i$th bit
Input: A has private input $x_1$ and signature $\mathrm{Sig}(x_1)$, B has private input
$x_2$ and $\mathrm{Sig}(x_2)$, and S has no private input.

Output: A and B learn $f(x_1, x_2)$, S learns nothing.

Protocol 3:

1. (a) S chooses $\delta \leftarrow \{0,1\}^{\kappa-1}$, $k_1 \leftarrow \{0,1\}^{\kappa}$, $k_2 \leftarrow \{0,1\}^{\kappa}$
and sets $\Delta = \delta \| 1$. S sends $\Delta$ and $k_1$ to A. S also computes
labels $\ell_i^0 = \mathrm{PRF}(k_1, i)$ and $\ell_i^1 = \ell_i^0 \oplus \Delta$ for $i \in [1, t_1]$.

(b) A computes labels $\ell_i^0 = \mathrm{PRF}(k_1, i)$ and $\ell_i^1 = \ell_i^0 \oplus \Delta$
for $i \in [1, t_1]$. For each bit $b_i$ of her input, A commits $c_i = \mathrm{Com}(b_i, r_i)$
and $c'_i = \mathrm{Com}(\ell_i^{b_i}, r'_i)$ using fresh randomness
$r_i$ and $r'_i$. A sends to S $\mathrm{Sig}(x_1)$ and $c_i$, $c'_i$ for $i \in [1, t_1]$.

(c) A proves in ZK the statement in equation 2 using private inputs
$x_1, b_1, \ldots, b_{t_1}, r_1, \ldots, r_{t_1}$. For each $i \in [1, t_1]$, A also
proves in ZK the statement in equation 4 using the corresponding private inputs.

2. S computes wire labels $\ell_i^0 = \mathrm{PRF}(k_2, i - t_1)$ and $\ell_i^1 = \ell_i^0 \oplus \Delta$
for $i \in [t_1 + 1, m]$. S then constructs garbled gates $G_f$ and sends
$G_f$ and A's commitments $c'_i$ for $i \in [1, t_1]$ to B.

3. S and B engage in $t_2$ instances of 1-out-of-2 OT as in Protocol
2 together with verification of B's input. Before B can learn the labels,
B forms $t_2$ commitments $c''_i = \mathrm{Com}(b_i, r''_i)$ using
fresh randomness $r''_i$ and proves in ZK the statements in equations
2 and 3 using private input $x_2, b_1, \ldots, b_{t_2}, r''_1, \ldots, r''_{t_2}$ and
$b_i, r''_i, k_i$, respectively. Here $k_i$ denotes the value chosen during
step 2 of the $i$th instance of the OT protocol.

4. A opens commitments $c'_i$ by sending to B pairs $(\ell_i^{b_i}, r'_i)$ for $i \in [1, t_1]$.
B checks whether $\mathrm{Com}(\ell_i^{b_i}, r'_i) = c'_i$ for each $i$ and aborts
if at least one check fails.

5. The remaining steps are the same as in Protocol 2.
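
The label derivation in step 1 of Protocol 3 can be sketched as follows. This is a minimal illustration under assumptions: HMAC-SHA256 truncated to $\kappa$ bits stands in for the PRF, $\kappa = 128$, labels are held as integers, and the function names are hypothetical. Because both S and A hold $k_1$ and $\Delta$, they derive identical label pairs without exchanging the labels themselves.

```python
import hashlib
import hmac
import secrets

KAPPA = 128  # security parameter in bits (assumption)


def prf(key, index):
    """Stand-in PRF: HMAC-SHA256 truncated to KAPPA bits, as an integer."""
    mac = hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac[: KAPPA // 8], "big")


def setup():
    """S's choices in step 1(a): Delta = delta || 1 and the PRF key k_1."""
    delta = (secrets.randbits(KAPPA - 1) << 1) | 1  # lowest bit forced to 1
    k1 = secrets.token_bytes(KAPPA // 8)
    return delta, k1


def wire_labels(key, delta, count):
    """Pairs (l_i^0, l_i^1) with l_i^0 = PRF(key, i) and l_i^1 = l_i^0 XOR Delta."""
    return [(l0, l0 ^ delta) for l0 in (prf(key, i) for i in range(1, count + 1))]
```

Forcing the last bit of $\Delta$ to 1 guarantees that the two labels of every wire differ in their lowest bit, which is what allows that bit to serve as a permutation bit during evaluation.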


$b_i$ of the output, if $b_i = 0$, $S_A$ sends to $\mathcal{A}_A$ the pair $(\ell_i, \ell'_i)$,
otherwise, $S_A$ sends the pair $(\ell'_i, \ell_i)$.

Now we examine the view of $\mathcal{A}_A$ in the real and ideal
model executions and correctness of the output. After receiving
the label pairs from $\mathcal{A}_A$, $S_A$ performs the same checks on
them as S would and thus both would abort in the same circumstances.
Similarly, if $\mathcal{A}_A$ provides malformed labels for
circuit evaluation, $S_A$ will immediately detect this in the ideal
model and abort, while B in the real world will be unable to
evaluate the circuit and also abort. Otherwise, in both cases the
function will be correctly evaluated on the input provided by
$\mathcal{A}_A$ and B's input. In the remaining interaction, $\mathcal{A}_A$ sees only
random values, which in the ideal world are constructed consistently
with $\mathcal{A}_A$'s view in the real model execution. Thus,
$\mathcal{A}_A$'s view is indistinguishable in the two executions.

Let us now consider malicious $\mathcal{A}_B$, for which we construct
simulator $S_B$ in the ideal model execution who simulates
correct behavior of A and S. First, $S_B$ simulates the OT.
It records the input bits used by $\mathcal{A}_B$ during the simulation,
which it stores as $x'_2$ and returns $t_2$ random labels to $\mathcal{A}_B$. $S_B$
also sends another set of random labels to $\mathcal{A}_B$. $S_B$ queries
the TP for $\mathcal{A}_B$'s output $f(x_1, x'_2)$ and chooses a pair of random
labels $(\ell_i^0, \ell_i^1)$ for each bit $i$ of the output. $S_B$ gives to $\mathcal{A}_B$
a simulated garbled circuit (as described in [55]) so that the $i$th


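Table 10. Performance of protocol 1 (with half-gates); work is in ms.

| Function | Input size | Party | Computation: offl. | Computation: onl. | Computation: total | Communication: sent | Communication: recvd | Communication: total |
|---|---|---|---|---|---|---|---|---|
| AES | 128 | A | 0.02 | - | 0.02 | 2KB | 0.01KB | 2.01KB |
| | | B | 0.42 | - | 0.42 | 182KB | 0.11KB | 182KB |
| | | S | - | 0.31 | 0.31 | 0.11KB | 184KB | 184KB |
| Hamming distance (bits) | 2^12 | A | 0.06 | - | 0.06 | 64KB | 0 | 64KB |
| | | B | 0.30 | - | 0.30 | 192KB | 0.2KB | 192KB |
| | | S | - | 0.23 | 0.23 | 0.19KB | 256KB | 256KB |
| | 2^13 | A | 0.11 | - | 0.11 | 128KB | 0 | 128KB |
| | | B | 0.65 | - | 0.65 | 381KB | 0.2KB | 381KB |
| | | S | - | 0.52 | 0.52 | 0.2KB | 509KB | 509KB |
| | 2^14 | A | 0.22 | - | 0.22 | 256KB | 0 | 256KB |
| | | B | 1.65 | - | 1.65 | 768KB | 0.4KB | 768KB |
| | | S | - | 1.39 | 1.39 | 0.4KB | 1MB | 1MB |
| Matrix multiplication (n x n ints) | 4 | A | 0.01 | - | 0.01 | 8KB | 0.06KB | 8.1KB |
| | | B | 13.4 | - | 13.4 | 6.01MB | 8KB | 6.02MB |
| | | S | - | 9.78 | 9.78 | 8KB | 6.02MB | 6.02MB |
| | 8 | A | 0.03 | - | 0.03 | 32KB | 0.25KB | 32.2KB |
| | | B | 107 | - | 107 | 48.0MB | 32KB | 48.1MB |
| | | S | - | 78.2 | 78.2 | 32KB | 48.1MB | 48.1MB |
| | 16 | A | 0.11 | - | 0.11 | 128KB | 1KB | 129KB |
| | | B | 858 | - | 858 | 384MB | 128KB | 384MB |
| | | S | - | 621 | 621 | 128KB | 384MB | 384MB |
| Edit distance (chars) | 32 | A | 0.00 | - | 0.00 | 4KB | 0 | 4KB |
| | | B | 6.97 | - | 6.97 | 4.61MB | 0.08KB | 4.61MB |
| | | S | - | 5.17 | 5.17 | 0.08KB | 4.6MB | 4.62MB |
| | 64 | A | 0.01 | - | 0.01 | 8KB | 0 | 8KB |
| | | B | 27.9 | - | 27.9 | 18.4MB | 0.9KB | 18.4MB |
| | | S | - | 20.6 | 20.6 | 0.9KB | 18.4KB | 19.3KB |
| | 128 | A | 0.01 | - | 0.01 | 16KB | 0 | 16KB |
| | | B | 113 | - | 113 | 73.8MB | 0.1KB | 73.8MB |
| | | S | - | 83.6 | 83.6 | 0.11KB | 73.8MB | 73.8MB |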

computed output label corresponds to the $i$th bit of $f(x_1, x'_2)$.
If after circuit evaluation, $\mathcal{A}_B$ does not send the correct output
labels to $S_B$, $S_B$ aborts the execution. Otherwise, $S_B$ sends
the pairs $(\ell_i^0, \ell_i^1)$ to $\mathcal{A}_B$.

The only difference between the view of $\mathcal{A}_B$ in the real
model and the view simulated by $S_B$ in the ideal model is that
$\mathcal{A}_B$ evaluates a simulated circuit in the ideal model. Computational
indistinguishability of the simulated circuit follows from
the security proofs of Yao's garbled circuit construction [55].
Thus, $\mathcal{A}_B$ is unable to tell the two worlds apart.

It is also straightforward to simulate the view of semi-honest
$\mathcal{A}_S$ because it has no input and receives no output. □

D  Additional  Results 

Table 10 provides the results of running protocol 1 using
representative functions with the half-gates optimization.


